Excursions 
in Number Theory, 
Algebra, and 
Analysis 


Springer


Undergraduate Texts in Mathematics 


Undergraduate Texts in Mathematics 


Readings in Mathematics 


Series Editors 


Pamela Gorkin 
Mathematics Department, Bucknell University, Lewisburg, PA, USA 


Jessica Sidman 
Mathematics and Statistics, Mount Holyoke College, South Hadley, MA, USA 


Advisory Board 

Colin Adams, Williams College, Williamstown, MA, USA 
Jayadev S. Athreya, University of Washington, Seattle, WA, USA 
Nathan Kaplan, University of California, Irvine, Irvine, CA, USA 
Lisette G. de Pillis, Harvey Mudd College, Claremont, CA, USA 
Jill Pipher, Brown University, Providence, RI, USA 


Jeremy Tyson, University of Illinois at Urbana-Champaign, Urbana, IL, USA


Undergraduate Texts in Mathematics are generally aimed at third- and 
fourth-year undergraduate mathematics students at North American univer- 
sities. These texts strive to provide students and teachers with new perspec- 
tives and novel approaches. The books include motivation that guides the 
reader to an appreciation of interrelations among different aspects of the 
subject. They feature examples that illustrate key concepts as well as exercises 
that strengthen understanding. 


Kenneth Ireland - Al Cuoco 


Excursions in 
Number Theory, 
Algebra, and Analysis 


Springer


Kenneth Ireland (deceased)
Fredericton, NB, Canada

Al Cuoco
Education Development Center
Waltham, MA, USA

ISSN 0172-6056 ISSN 2197-5604 (electronic) 
Undergraduate Texts in Mathematics 

ISSN 2945-5839 ISSN 2945-5847 (electronic) 
Readings in Mathematics 

ISBN 978-3-031-13016-8 ISBN 978-3-031-13017-5 (eBook) 


https://doi.org/10.1007/978-3-031-13017-5


Mathematics Subject Classification: 11Axx, 11Dxx, 11Mxx, 11Nxx, 26Axx, 26Bxx, 26Cxx, 
12Fxx, 00Axx, 01Axx, 11Cxx, 11Jxx, 11Lxx, 20Bxx, 20Dxx, 30xx


© Springer Nature Switzerland AG 2023 

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or 
part of the material is concerned, specifically the rights of translation, reprinting, reuse of 
illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, 
and transmission or information storage and retrieval, electronic adaptation, computer software, 
or by similar or dissimilar methodology now known or hereafter developed. 

The use of general descriptive names, registered names, trademarks, service marks, etc. in this 
publication does not imply, even in the absence of a specific statement, that such names are 
exempt from the relevant protective laws and regulations and therefore free for general use. 
The publisher, the authors, and the editors are safe to assume that the advice and information in 
this book are believed to be true and accurate at the date of publication. Neither the publisher nor 
the authors or the editors give a warranty, expressed or implied, with respect to the material 
contained herein or for any errors or omissions that may have been made. The publisher remains 
neutral with regard to jurisdictional claims in published maps and institutional affiliations. 


This Springer imprint is published by the registered company Springer Nature Switzerland AG 
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland 


Micky: Once again, everything I do,
I do for you.


Preface 


History 


In 1972, I took a summer course from the late Ken Ireland at Bowdoin 
College. I was a new high-school teacher attending a four-summer NSF 
institute leading to a master’s degree. For close to fifty years, I had wanted to 
create a book for mathematics majors based on Ken’s course, his typed notes, 
and the accompanying experience of learning from his lectures and discus- 
sions during his office hours. Now I have done so, and you are holding it in 
your hands or reading it on a computer monitor. I had three reasons for 
writing this book: 


(i) As a capstone course for majors, it ties together much of undergraduate
mathematics in ways that situate topics in the history of the subject and
that make connections to major themes in the discipline. It is not that
the mathematical topics have direct connections to the content of any
particular course (although many do); rather, they provide valuable
background that can be used to place that content in the broad landscape
of mathematics as a scientific discipline. One of Ken’s premises
was that there are dozens of famous mathematical results that are part
of the “folklore” of many of the courses that undergraduates take.
Some of those results go back to the Greeks. Some come from arithmetic,
number theory, and analysis, and some involve classical algebra.
His course developed the mathematics needed to prove a large
collection of these results.

For example, the fact […] precollege mathematics—often stated but
never proved. So too with the fundamental theorem of algebra.


(ii) Most “content courses” for undergraduates use a classical structure and
pedagogy; general results are developed with proofs, and then the 
results are applied to several concrete situations. This is a wonderfully 
efficient and elegant structure for presenting established results, but it 
misses some of the messiness and false starts that are so typical of 
doing (as opposed to learning) mathematics. Ken’s style was much less 
formal; it was based much more on filling in details and background 
for problem sets that provide practice with technique, to be sure, but 
also preview ideas that might not get nailed down until later in one’s 
mathematical career. This kind of immersion in doing mathematics has 
been one of the inspirations for my teaching and for the approach that 
my colleagues and I take to professional development for practicing 
teachers. The other is my collaboration with Glenn Stevens in Boston 
University’s PROMYS for teachers, also organized around experience 
before formality. I have seen how an immersion experience in math- 
ematics is a jump start for many people, helping them develop the 
disposition and habits needed to make sense both of more traditionally
organized courses and of mathematics in practice. 


(iii) In addition to a dive into classical content, this book gives students a 
real sense of mathematical culture—its history, its norms, and even its 
humor. The book is full of anecdotes about mathematicians, examples 
of milestones in the history of mathematics, and stories about life as a 
mathematician. Ken was a master at this, a consummate mathematician 
who loved to tell stories and make jokes. “Problem 17 is left out due to 
lack of space.” “Hint for Problem 23: See Problem 17.” My experience 
is that this kind of playfulness draws people into the culture. It cer- 
tainly had that effect on me. 


Some will look at the book and say that it has holes. And indeed it does, 
but that, too, is on purpose. It makes assumptions about what the reader 
knows (a little field theory, for example). Although I have triaged some 
of these, I think leaving things to the reader is important, because having to 
act on insufficient information reflects reality. We routinely face problems 
that arrive without having asked us what chapter we just read. I have included 
citations that help students look up material for themselves. 

The book is aimed at students who have the equivalent of the first two or 
three years of undergraduate mathematics, but it would work for students 
with less background if they have the disposition and drive to fill in some 
gaps. Instructors may need to (re)introduce some common notation—Z, Q, 
and so on. 

The Ireland course had a huge influence on my teaching and career, and 
this book is my attempt to preserve his work and share it with others. I hope 
that it will help students and readers bind together undergraduate studies and 
give them a big-picture view of the landscape they want to tour with their 
own students or in their future mathematical work. 


—Al Cuoco


From Ken’s Original Preface 


These notes represent a series of thirty lectures delivered to an enthusiastic 
and capable audience of sixty-four secondary school teachers. The purpose 
of the course was to acquaint the student with several mathematical struc- 
tures, their interrelationship and fundamental properties. At the same time, 
technique was developed through exercises. As the lectures proceeded, two 
hundred exercises were distributed so that the students could acquire 
manipulative skills and encounter the limitations in actual practice of general 
theory. The exercises cover special cases of Galois theory, Fourier series, 
field theory, symmetric functions, modular arithmetic, and so forth. Discus- 
sion resulting from the problems generated supplements, expanded chapters, 
and a deeper study of certain concepts. 

The fundamental theorem of algebra was examined from several points of 
view, and its algebraic-analytic aspect thoroughly discussed. An interest in 
numbers not algebraic over the rationals led to a chapter on irrational and
transcendental numbers. 

The use of orbits in elementary group theory shows, beyond a doubt, that 
reasonably sophisticated results can be obtained with very little effort. The 
Sylow theorem on existence of subgroups in a group, while appearing here at 
the end of Chapter 2, was proved toward the end of the course, as a result of 
interest generated by the problems. The underlying theme of Galois theory 
sprinkled through the lectures and exercises would have led to a proof of the 
fundamental theorem, had time permitted. 

The analytic portion of the lectures entered with the fundamental theorem 
of algebra and continued with an elementary treatment of classical Fourier 
series. After a tedious proof of the transcendence of π, it was refreshing to
find simple regular expressions for this number arising from trigonometric 
expansions of simple functions. A sophisticated link with Chapter 2, on 
pentagons and modular arithmetic, is achieved by the evaluation of the 
classical Gauss sum, a sum of roots of unity appearing in the construction of 
regular polygons. The central result in the Fourier series chapter is the simple 
proof of convergence due to Dirichlet. The simplicity of this result is often 
obscured in texts that develop the theory more extensively or are primarily 
concerned with physical applications. 

The constant interplay between basic concepts and the frequent capturing 
of substantial results, along with the wonderful cooperation and enthusiasm 
of the members, of course, made the adventure a great pleasure for me. 

I wish to thank Bowdoin College and NSF for giving me the opportunity 
to work with the participants of the program. Special thanks also to Nancy 
MacDonald, who did a fantastic job of keeping abreast of the daily ream of 
lecture notes. 


—Kenneth Ireland 
August, 1972 


Ken Ireland, 1937-1991 


Using This Book 


For Readers 


This is not your grandmother’s mathematics book. Rather than a “dogmatic 
exposition of an established theory” [9], think of this book as a guided tour of 
some major themes in modern mathematics. It is based on a course given by 
Ken Ireland in 1972, so a great deal has happened since then. Most notably, 
the celebrated Fermat conjecture (that there are no triples $(a, b, c)$ of positive
integers such that $a^n + b^n = c^n$ for $n > 2$) is now a theorem, established in the
1990s. And the use of computational technology (including computer alge- 
bra), now a mainstay of mathematics research, was in its infancy in the 
1970s. So I have added sections that address these and other advancements, 
but I have tried to stay faithful to the playful and parsimonious style of Ken’s 
original notes—typescript and handwritten notes that I have hung onto all 
these years. 

The book begins with a chapter called “Dialing In Problems.” It contains 
eight problem sets, each designed to help you “dial in” to several mathe- 
matical structures and theories. I can’t stress the following enough: 


These problems make up the heart of what you'll learn from this book. 
The chapters exist to support your work on the Dialing In problems. 


The sets are split into smaller numbered sections, each one prefaced with a 
note about the main themes developed in that set. Each set also contains 
problems that deepen earlier results and methods and preview upcoming 
ideas, often with special cases of theorems that will be proved later in more 
generality. The purpose of this setup is to develop interconnections—for 
example, connections between algebra and geometry or between analysis and 
arithmetic. 

If you are using the book as a text for a course, your instructor will have 
more to say about how the Dialing In problems are to be assigned and used.
Here’s my advice (from someone who has lived with these problems and 
used them with students for decades) about how they can be most useful: 


Look over each set before you dig in. Look for problems that look familiar and try 
them. And make a mental or real list of the problems that make no sense (because of 
either vocabulary or notation). If a problem makes you wonder or scratch your head, 
that’s good. The Dialing Ins are meant to be tried and tried again. Feel free to skip 
around and to revisit a problem after you have given it some time to percolate. 


While each section contains a mix of problems, each of them can be 
supported by chapters in the book: 


• Set 1 is a tour of some ideas that are developed in Chapter 2, a chapter that
introduces some of the main structures and themes developed in the
book—complex numbers, finite fields, group theory, and number theory.


• Set 2 is supported by the ideas in Chapter 3, a chapter that develops
unique factorization in Z and introduces formal Dirichlet series as a tool
to investigate questions in arithmetic.


• Set 3 deepens some of the ideas and methods in Chapter 3 and develops
some concrete experience with the algebraic tools that are used in Chapter 4.
If you like to calculate in old-fashioned algebra (as I do), this set is for you.


• Set 4 offers more classical algebra, including a heavy dose of what was
once called the theory of equations—expressing the roots of a polynomial
equation as expressions in its coefficients. Chapter 4 provides support.


• Set 5 revisits and deepens some algebraic and arithmetic ideas and results
from the previous chapters—field theory, group theory, polynomial
algebra, and arithmetic. It previews some ideas that will come up in
Chapter 5 around irrational real numbers.


• Set 6 introduces in earnest the analytic themes of the book (there are
previews before this). Use Chapter 5 as a resource. In this chapter, you will
study the proofs of the irrationality and transcendence of some classical
constants that show up in precollege mathematics (such as π and e).
In this set, you will see some of the ideas that motivate the proofs in
Chapter 5, but you will also revisit some old friends from the previous
chapters.


• Sets 7 and 8, the last two Dialing Ins, cover a wide swath of beautiful
mathematics, folding Fourier series into the mix of algebra, number
theory, and analysis, with lots of trigonometry and beautiful series
calculations for added measure. Use the last two chapters of the book as a
resource.


Of course, you can use this material in any way you like, but the preferred
sequence is try, study, try again.


After you have worked through a chapter or two, look back at the Dialing 
Ins. Just think about what you have learned. It will make you smile. 

There are also exercises at the end of most sections. These are closely tied 
to the sections themselves, and they provide practice and extensions of some 
of the ideas introduced in the section. Some of them are previewed in the 
Dialing In sets. 

In addition to the Supplement sections that Ken mentions in his intro- 
duction, there are also sections labeled Lookout Point and Take It Further. 
These are digressions into related topics or deeper dives into the ideas pre- 
sented in the chapter. Many of them are accompanied by citations to other 
works. They can often be skipped, but doing so would be a shame. 


The main point I want to make is that this book is an invitation to do 
mathematics. Yes, you’ll learn about many results and develop many tech- 
niques, all curated by a brilliant mathematician (Ken, not me) and teacher. 
But what is more important is that you will experience the thrill of your own 
mathematical thinking. 


About prerequisites: I want to say that there are none, except for the drive 
and stamina needed to work on hard problems and ideas. More realistically, 
the formal prerequisites for this book are standard courses in abstract algebra 
and real analysis. However, readers without these courses, do not despair!
The process of learning mathematics need not always be linear; with a spirit
of inquiry and some extra reading, the background material can be acquired 
as you go along. If you come across a term or result that you haven’t 
encountered before (or perhaps you have, but it didn’t stick), look it up. 
Internet searches of the kind we have now didn’t exist when I started 
teaching, but now they are ubiquitous. (Warning: The quality of search 
results varies wildly.) Keith Conrad’s website [12] is a wonderful resource, 
beautifully written and thorough. There are others, and if you are using this 
for a text or resource in a course, your instructor will surely have a stash of 
good references. As for published books, most of what is needed here is 
developed in [19, 41, 53, 70]. 

And finally, spending as much time as you can working on the Dialing In 
problems, looking up terms or results when they do not make sense, will 
enhance your enjoyment and understanding of the expository parts of the 
text. 


I'll bet that a great deal of this preface makes little sense right now. Think 
of it as a kind of Dialing In. After you have worked through a couple of 
chapters, come back and try again. 


For Instructors 


The advice in the section above was for readers of the book. But for instructors 
who are thinking about using it as a text or resource for a course, a piece of that 
advice still holds: this is not your grandmother’s mathematics text. 

A central feature of the design is the Dialing In problem sets. They are 
based on problem sets that were handed out in waves during the original 
course, roughly one set (15-20 problems) each day or two. They contained 
reviews, previews, and études. Like the celebrated Ross-PROMYS [81] and 
PCMI [82] problem sets, students should not (and will not) be able to 
complete all of the problems when they are first introduced. So to distinguish 
them from the section-specific Exercises, I have numbered them consecu- 
tively, 1-200, and placed them in the first chapter, sectioned off, with advice 
about how to use them. Each Dialing In problem is a kind of open invitation 
to explore an idea or a connection among ideas. Students will get the most out 
of the book if they spend most of their time Dialing In before ideas are 
elaborated in lecture, and of course afterward as well. Indeed, in the original 
course, lectures were often inspired by ideas that arose from students’ 
attempts to approach this or that problem. It would be fun for you and your 
students to build an occasional lecture that is a riff on what students have 
done in the Dialing In sets. 

Dialing In supports an approach to learning mathematics that my col- 
leagues and I call experience before formality. In the last decade, this design 
principle has gained traction among instructors at all levels. It goes by several 
names these days, each a variation on the theme of active learning, 
inquiry-based learning, exposure before closure, and many others. Teachers 
at all levels who have taught in this way cite many benefits; one of the most 
salient is that students retain what they learn and use it in ways that are
faithful to how mathematics is done.

And by the way, many of the citations used in the chapters refer to other
books, some very old. If you don’t have ready access to some of these, you
can use the good old internet.

The Epilogue (which follows Chapter 6) will say more about this interplay
between “doing and studying.”

A note on notation: to avoid a digression into the meaning of Z/pZ, Z_p
stands for the field with a prime number, p, of elements, identified as the
field of integers modulo p, even though it often stands for the p-adic
integers. I hope this doesn’t irritate anyone.

Even the “flipped classroom” model can be made to fit into this genre.

If cuts have to be made, consider the fact that although algebra, analysis,
and number theory rear their heads throughout, the front half of the book is
more algebra than analysis, and then things switch in Chapter 4.

Many of the alumni of those NSF institutes (including the Bowdoin program)
contributed to the directions that high-school mathematics took throughout
the last century.

For 50-minute classes, some days could be mostly student work and some
could be mostly lecture.

Chapter 5 is another good choice for this audience, but it may be a little
steep for students without solid algebra and analysis backgrounds.

The Ireland course was decades ahead of its time in this regard, and I still 
remember how unsettling it was for many teachers in the class (“why doesn’t 
he just tell us how to do it?”). Working intensely on problems in tandem with 
learning from lectures and explanations has had a profound effect on my own 
approach to teaching and learning. And I am convinced that before grad 
school, all of the mathematics I really understood (rather than simply learned) 
had its roots in the experience before formality structure of Ken’s course. 

This book is based on an intense residential course for practicing 
high-school teachers that ran for six weeks, two hours a day, five days a 
week. That model is perfect for this material, but what about the more 
realistic situations faced by most faculty? Here are some thoughts, gathered 
from my experience and that of colleagues. 


(i) Many states now require that teachers enroll in a content-based mas- 
ter’s program early in their careers. The NSF teacher institutes in the 
1960s and 70s were also aimed at early-career teachers. This material 
would fit perfectly with such programs, as a full course or as selections 
curated from the text (see item iv below for examples of curations). 

(ii) For all the reasons listed in the preface, this material would make a 
great capstone course for mathematics majors. But don’t be fooled by 
the fact that this is a thin book. It covers a wide swath of modern 
mathematics, and it treats it in depth with a minimum of pedantry. If 
students really dig into the problems as they learn the material, the 
book can easily fill a semester. For example, a semester course that 
meets twice a week for 80-minute sessions might go like this: 


• Weekend homework is the Dialing In for the upcoming week.


• Tuesday: class begins with table/small group discussions of Dialing
In problems (15-20 minutes). Then


— 20-25 minutes of lecture
— 30 minutes of work on the Exercises specific to that section
— 10 minutes of reflection and presenting


• Homework for Thursday: work on the remaining Exercises and
Dialing In problems.


• Repeat


This structure can be easily adapted to a flipped classroom by 
switching out the lecture for a preclass video. 


(iii) Selections from the book are ideal fodder for seminars for advanced
students. A good example is Chapter 4, which offers different takes on
the fundamental theorem of algebra. Especially for preservice teachers,
this is a valuable piece of background for talking about historical
developments and different approaches. The fundamental theorem of
algebra lives in the folklore of high school, and without an
understanding of the essential “analytic step,” it is all too easy (as
some books do) to fall into the trap of Exercise 4.12.


(iv) Sections of the book could enhance a number of standard undergraduate
courses:


• Sections 2.1–2.3 and 3.1–3.4 fit nicely into an elementary number
theory course.


• Sections 2.3–2.6 and 3.1–3.3 make ideal units for a first abstract
algebra course.


• Sections 5.1–5.5 make a good basis for an extended project in an
analysis course.


These are just examples of how the book can be used as a flexible resource 
for many undergraduate courses. My hope is that however you use these 
materials, you will find in them ways to fuel your imagination and that of 
your students. 
And the Dialing In sets are a well-crafted store of problems that will fit in
a wide variety of settings.

I keep lobbying the Irelands to publish some of their stories in some way.
They would be a real inspiration to the next generation of mathematics
teachers at all levels.

Our daughter Alicia was less than a year old when we packed up our 1970 VW
and headed for Bowdoin.

The Lookout Point design element (see page 17 for the first of these) was
Loretta’s idea.

[…] board for many of the […]

[…] Loretta enough for her insight into how we could make this usable while
preserving its core organizing principle that asks students to try things before 
they are formally introduced. 

Loretta introduced me to David Kramer, and this was another source of 
unexpected good luck for me as the project came to a close. David is one of 
those people who do not fit into any of the standard categories. He is, at once, 
a mathematician (number theory, of course), a talented writer (with a won- 
derful sense of style), a proofreader (with an amazing eye for detail), an editor 
(with an ear for clarity and precision), a translator (see, for example, [6]), and 
so much more. If you could compare my penultimate draft of this book with 
what it became after David did his work, you would see what I mean. 

My colleagues and friends Paul Goldenberg and Sarah Sword were always
[…] suggested the sample semester schedule on page xviii; Paul showed me ways to
[…] were hand drawn in Ken’s original notes. Throughout, they read drafts and
made everything they touched much better.

John Ewing read the entire set of typescript notes that Ken produced for the 
course. John saw what I saw and almost insisted that I get this project underway. 
John knows exactly the kinds of teachers that this material will excite. 

The team at SPi Global created the TeX source from the original typewritten
notes created by Nancy MacDonald in 1972. The folks at SPi were extremely
patient with me. And Yongyuan Huang, a brilliant undergraduate at Boston
University, helped me navigate the vagaries and potholes involved in BibTeX.

And as always, none of this work would have been possible without the 
love and support of my dear Micky. 


Kenneth Ireland 
Al Cuoco 
Recollections from Colleagues and Friends


From Michael Rosen 


I began my career as an instructor of mathematics at Brown University in 
September 1962. Ken came to Brown as an assistant professor a few years 
later. Soon after his arrival on campus, we became friends, partly because we 
had many interests in common, e.g., number theory, algebraic geometry, 
literature, poetry, and classical music. 

For a while we shared an office. I soon discovered that Ken was a fas- 
cinating individual with a keen intellect, a wonderful dry sense of humor, and 
a zest for life. I found Ken to be one of the most engaging and interesting 
people I would ever know. 

He came to Brown after receiving his Ph.D. degree from Johns Hopkins 
University under the direction of Bernard Dwork. He spent a year or two at 
Brandeis University before coming to Brown. As it happened, I received my 
BA degree from Brandeis in 1959. Our paths were destined to cross early and 
often. 

At first, Ken and I shared an office on the first floor of Howell House, a 
rickety old building that housed the Mathematics Department. Ken imme- 
diately noticed that our office could easily accommodate a ping-pong table. 
We rented and installed such a table and soon became very popular with our 
colleagues, who would show up often for “a game or two.” This was great 
fun, but not at all conducive to serious work. We soon abandoned both our 
office and the ping-pong table for two small individual offices on the third 
floor. Our productivity immediately increased. 

Ken brought to my attention a short, but very influential, paper by the 
world-famous mathematician André Weil. The title of Weil’s paper was
“Numbers of Solutions of Equations in Finite Fields.” Toward the end of this
paper, Weil formulated three conjectures that became the focus of research in 
arithmetic-algebraic geometry for many years. The first conjecture was 
proved by Ken’s thesis advisor, Bernard Dwork. The second was proved 
primarily by Alexander Grothendieck, and the third, the Riemann hypothesis
for varieties over finite fields, was proved by Pierre Deligne. Weil’s paper 
gave evidence for his conjectures by showing that they were true in special 
cases. All these conjectures were proved with the use of new and very 
difficult theorems in algebraic geometry. These were far from being acces- 
sible to beginners. However, we noted that the results in Weil’s paper were 
relatively elementary and could be made understandable to advanced 
undergraduates and beginning graduate students in mathematics. Thus was 
born our project of writing a number theory book with the goal of presenting 
all the background needed to be able to read and understand Weil’s paper 
discussed above. Our first book in this direction was Elements of Number 
Theory: Including an Introduction to Equations over Finite Fields. It was 
published by Bogden and Quigley in 1972. Subsequently, Bogden and 
Quigley went out of business. However, Springer-Verlag agreed to publish a 
greatly expanded version of our book with the title A Classical Introduction 
to Modern Number Theory [41]. This was published in 1982. A second
edition appeared in 1990. Unfortunately, not long after the second edition appeared,
Ken Ireland passed away, suddenly and prematurely, in 1991.

The University of New Brunswick, where Ken taught from 1971 to 1991, 
established an annual lecture series in his honor entitled “The Ken Ireland 
Memorial Lecture.” I was very pleased to have been asked to give the first 
lecture in this series, on November 2, 1992. The title of my talk was “Niels 
Henrik Abel and Equations of the Fifth Degree.” 

I have many fond memories of our relationship. When we were both at 
Brown, I would occasionally visit the Math Department after dinner to work 
in my office. On entering the building I would often find Ken, alone in the 
Common Room, playing classical music on his flute. The beautiful music 
would follow me up the stairs. He was a truly unique and memorable indi- 
vidual. To this day, I still miss him. 


—Michael Rosen, Professor Emeritus, Brown University 


From Ken Ribet 


When I arrived at Brown University in 1965, I thought that I would major in 
mathematics but had no sense of what a mathematics student might study in 
college. I was turned on to abstract mathematics by my first professors— 
Frank M. Stewart and Allan H. Clark. As Clark’s course on abstract algebra 
was coming to a close, I met Ken Ireland by chance in the Brown math 
building (Howell House). After I told Ireland what I was studying, he 
challenged me to cite examples of abelian Galois extensions. I replied 
immediately that extensions of finite fields were cyclic. Ireland agreed that 
this was the case but then asked me about number fields. Cyclotomic 
extensions were on his mind. At the end of our discussion, Ireland suggested 
that I read an article about roots of unity. He handed me André Weil’s 
celebrated 1949 article “Numbers of Solutions of Equations in Finite Fields.” 
This was the article that corresponds to the Weil conjectures about the 
cohomology of algebraic varieties over finite fields. The article is totally 
elementary (and very clearly written); I could follow every word. I returned 
to see Ireland not long after our initial encounter and reported that I had 
finished Weil’s paper. “Great! Now read this one.” Ireland was asking me to 
study Weil’s “Jacobi Sums as ‘Grössencharaktere,’” which I was able to do, 
more or less. (I had no sense that Weil was computing the Hasse–Weil zeta 
function in special cases.) 

As a result of the apparent success of my independent study, Ireland offered 
to direct the senior thesis, which I wrote during my last year at Brown. He was 
not shy about making recommendations: he told me what literature to read, 
where to apply to graduate school, and who my advisor should be at Harvard 
—the school that he recommended for my graduate study. 

Although Ken Ireland seemed to have a mental block against doing 
research on his own, he devoured preprints on all sorts of topics and orga- 
nized research-level seminars for discussion of the most interesting papers. 
He was a marvelous classroom teacher, and he had a great sense of humor. 




He began the graduate algebra course that I attended with the statement that 
the definition of a group would be left as an exercise. While visiting a 
middle-school class, he asked the children in front of him to define a point in 
mathematics. A kid walked up to the front of the room and drew a dot on the 
blackboard. Ireland squinted at the dot and proclaimed, “That’s no point. It 
looks like a pile of chalk.” 


—Kenneth A. Ribet, University of California, Berkeley 


From the University of New Brunswick 


Ken Ireland joined the Department of Mathematics and Statistics at the 
University of New Brunswick, Fredericton, in 1971. He immediately became 
one of its most prominent members in scholarship, teaching, and collegial 
decision-making, remaining a leading light throughout his two decades with 
us. A colleague recalls that “Ken was a penetrating thinker who could go to 
the heart of a problem and solve it with elegant style and transparent rigor.” 
His command of mathematics was exceptionally broad and deep, enabling 
him to assist colleagues in diverse fields and facilitate their research work, 
while keeping abreast of developments well beyond his own immediate fields 
of number theory, algebra, algebraic geometry, and analysis. A colleague 
working in analysis recalls, “Ken’s example motivated me ... to improve.” 
An applied mathematician recalls successful collaborations with Ken on 
projects in computational number theory. 

He was an outstanding, truly gifted teacher. He inspired the students in his 
classes and was exceptionally generous with his time assisting students, not 
only those enrolled in his own courses but any students who approached him 
for help in mathematics. Many students in mathematics courses are appre- 
hensive and lack confidence in their own abilities. A colleague recalls how 
Ken helped them overcome these issues by “offering respect to them and 
receiving it back,” and by breaking tension with the use of humour that 
helped such students to relax and focus on course topics. Another recalled 
that he also was a generous mathematical mentor to talented undergraduate 
and graduate students. 

A superb expositor in general and frequent contributor of talks in the 
department’s weekly research seminar series, he inspired others by his 
example. These talents were brought to international audiences in the widely 
acclaimed book A Classical Introduction to Modern Number Theory, coau- 
thored with his long-time collaborator and friend Michael Rosen of Brown 
University. This text appeared in two editions, 1982 and 1990, the latter with 
additional chapters discussing research advances by prominent number the- 
orists during the intervening decade. Both written during his time at UNB, 
they made accessible to graduate students some of the most significant results 
of mid-twentieth-century research in the field. 

Ken actively engaged in recruitment of new faculty, insisting on excellent 
research and teaching, while also promoting equity and diversity. He was a 
leader in successful efforts to recruit the department’s first two women in 
full-time faculty positions, making this a priority. He helped encourage both 
these highly qualified candidates to accept UNB’s offers by being knowl- 
edgeable about their specialties. One recalls, “Ken knew all about my thesis.” 

His interest in improving mathematical education in schools and experi- 
ence in summer NSF programs for school teachers in the United States led to 
securing a joint appointment between our department and UNB’s Faculty of 
Education. His knowledge and active engagement at UNB in this area, along 
with personal persuasiveness, were key factors in the two relevant deans and 
the vice-president agreeing to authorize the joint position. Both women that 
Ken helped recruit were granted tenure and made many significant contri- 
butions during their careers at UNB. 

Ken also was an inspiration in cultural and intellectual matters, encour- 
aging others to broaden their horizons. A dedicated amateur flautist, he 
occasionally gave public recitals, both solo and accompanied by other 
musicians. He preferred challenging scores, with J. S. Bach his favourite 
composer. He had wide literary interests, including novels and poems by 
Russian, German, and Austrian writers, which he read in the original lan- 
guages. His profound understanding of the history of mathematics shone in 
the course he regularly taught, which always attracted substantial enrollments 
of students in diverse degree programs. At the time of his death, Ken was 
well advanced in a manuscript on the history of reciprocity laws in number 
theory. It extended from the conjecture by Euler and first proofs by Gauss 
through the many generalizations and applications by others during subse- 
quent centuries. 

Ken’s life and work are honoured by the Ken Ireland Memorial Schol- 
arship (two are awarded annually to UNB undergraduates) and the annual 
Kenneth Ireland Memorial Lecture series delivered at UNB by distinguished 
mathematicians. Topics in the lectures have represented a wide range of 
mathematical fields, as may be illustrated by a partial list of speakers from the 
three decades of the series: Michael Rosen (Brown University—the first in 
the series), Yuri Bahturin (Lomonosov Moscow State University), Gilbert 
Strang (Massachusetts Institute of Technology), Nigel Higson (Pennsylvania 
State University), Kenneth Ribet (University of California, Berkeley), Kumar 
Murty (University of Toronto, the most recent in the series). 


—Bruce Lund, Gordon Mason, Barry Monson, Nora NiChuiv, 
Donald Small, and Jon Thompson. 
Definition 

the field of real numbers 

the field of complex numbers 

the norm of the complex number z 


the unit circle 

$\cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$ 
the field of rational numbers 

the ring of polynomials in x with coefficients in a field F 
the field obtained by adjoining the number « to F 

the ring of integers 

Euler’s phi function 


the field of integers modulo a prime p 
the minimal polynomial for $\cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$ 
the degree of a field extension 
a is congruent to b modulo m 


Legendre symbol 


squares in Z,, 


h is a factor of n 

the number of left cosets of a subgroup G; in G 
the number of elements in G(S) 

the highest power of p that divides r 
$a_1\mathbb{Z} + a_2\mathbb{Z} + \cdots + a_n\mathbb{Z}$ 

the greatest common divisor of integers a and b 
the degree of polynomial f 

formal Riemann zeta function 

the field of p-adic numbers 


distance from the complex number z to 0 




Dialing In Problems 


In case you forgot or didn’t read (ahem) the purpose of the Dialing In problem 
sets in the preface, these are problems for you to try before (or in tandem with) 
your formal instruction or reading. They cover a wide range of topics. Some 
of them will not be familiar to you. But try them now, look things up (in this 
book, for example), and come back to them as you proceed through the text. 


1.1. Dialing In Set 1 


In this book, $\mathbb{Z}_p$ stands for the ring of integers modulo $p$, while 
$\mathbb{Z}_p^*$ stands for the nonzero elements of $\mathbb{Z}_p$. 

See Section 2.2.4 for the meaning of primitive. 

1. Find the solutions in $\mathbb{C}$ to $x^n + 1 = 0$. 

2. Find all primitives in $\mathbb{Z}_{17}$. 

3. Show that 

   $$\cos 4x = \cos^4 x - 6\cos^2 x \sin^2 x + \sin^4 x.$$ 

4. Show that there are infinitely many primes $p \equiv 3 \pmod 4$. 


5. Find a group of order 16 such that every element different from the 
   identity has order 2. 

For inspiration, how about a group of order 4 in which every nontrivial 
element has order 2? 

6. Find the subgroup of order 4 in Z7. 

7. Calculate the subgroup of cubes in Zjy and Z3,. 

8. Find all groups of order 4. 

9. Let $E \supseteq F \supseteq K$ be three fields. Suppose that $E$ is a vector space of 
   dimension $n$ over $F$, while $F$ is a vector space of dimension $m$ over $K$. 
   Show that $E$ is a vector space of dimension $nm$ over $K$. 

What does it mean for a vector space to have dimension $n$ over $F$? 


10. Find an irreducible polynomial with rational coefficients that has /2+./3 
for a root. 


11. Show that if G is a cyclic group of order n, then there is exactly one 
subgroup of order m for each m | n. 


12. Show that if $p$ is prime, then there is a polynomial in $\mathbb{Z}_p[x]$ with no root. 
13. Find an irreducible polynomial in $\mathbb{Z}_2[x]$ of degree 4. 


14. Show that $-3$ is a square in $\mathbb{Z}_p$ if and only if $p \equiv 1 \pmod 3$. Give 
examples. 


The intriguing question 
of representation by sums 
of squares is taken up in 
Section 3.4. Let us see 
what we can work out for 
ourselves with the tools 
we already have. 


Why a greatest common 
divisor and not the 
greatest? Stay tuned. 


See [19, Chapter 8] for 
more about Z[p]. 


“Formal Dirichlet series”? 


The representation func- 
tion r(n) is defined in 
Section 3.4. 


Chapter 1 Dialing In Problems 


1.2 Dialing In Set 2 


Welcome to a new installment of Dialing In. As usual, it contains some look- 
ing back and some previews of coming attractions. The previews include a 
visit with the following intriguing question: given an integer, in how many 
ways can it be expressed as the sum of two perfect squares? Try a few exam- 
ples and see what you can see. Looking back, we revisit some group and field 
theory. There’s plenty here to appeal to a wide variety of interests. So pick 
and choose and have fun. 


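15. Factor $8 + 7i$, $4 + 5i$, $11 + 12i$ into irreducibles in $\mathbb{Z}[i]$. 

16. Show that 

    $$1 + x + x^2 + x^3 + x^4 + x^5 + x^6$$ 

    is irreducible in $\mathbb{Q}[x]$. 

17. Find a greatest common divisor of $3 + 5i$ and $1 + 3i$. 

18. Define $\chi(n) = 0$ if $n$ is even, $\chi(n) = -1$ if $n \equiv -1 \pmod 4$, and 
    $\chi(n) = +1$ if $n \equiv 1 \pmod 4$. Show that $\chi(ab) = \chi(a)\chi(b)$. 

19. Show that if $\gamma$ is irreducible in $\mathbb{Z}[i]$, then so is $\bar{\gamma}$. 

20. Apply the Euclidean algorithm in $\mathbb{Z}[i]$ to 
    (a) $\alpha = 1 + i$, $\beta = 6 + 5i$, 
    (b) $\alpha = 4 + 3i$, $\beta = -1 + 7i$. 

21. A subset $\sigma \subseteq \mathbb{Z}[i]$ is called an ideal if 
    (a) $\alpha, \beta \in \sigma \Rightarrow \alpha + \beta \in \sigma$ and 
    (b) $\alpha \in \sigma$, $\gamma \in \mathbb{Z}[i] \Rightarrow \alpha\gamma \in \sigma$. 
    Show that every ideal in $\mathbb{Z}[i]$ is of the form $\delta \cdot \mathbb{Z}[i]$. 

22. Develop the arithmetic of $\mathbb{Z}[\rho] = \{a + b\rho \mid a, b \in \mathbb{Z}\}$, where $1 + \rho + \rho^2 = 0$. 

23. Show that if $\gamma$ is irreducible in $\mathbb{Z}[i]$, then either 
    (a) $\gamma\bar{\gamma} = p$ for a prime $p \equiv 1 \pmod 4$, or 
    (b) $\gamma = uq$ for a prime $q \equiv 3 \pmod 4$ and a unit $u \in \mathbb{Z}[i]$, or 
    (c) $\gamma = u(1 - i)$ for some unit $u \in \mathbb{Z}[i]$. 

24. Define $\mu(1) = 1$, $\mu(n) = 0$ if $p^2 \mid n$ for some prime $p$, and 
    $\mu(p_1 \cdots p_s) = (-1)^s$ if $p_1, \ldots, p_s$ are distinct primes. Show that 
    $\sum_{d \mid n} \mu(d) = 0$ if $n > 1$, and conclude that 

    $$\sum_{n=1}^{\infty} \frac{\mu(n)}{n^s} = \frac{1}{\zeta(s)}$$ 

    as formal Dirichlet series. 

25. If $(a, b) = 1$, $a, b \in \mathbb{Z}$, $a, b > 0$, show that $4r(ab) = 4r(a)r(b)$. 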
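If you would like to check your hand computations for the Euclidean algorithm in $\mathbb{Z}[i]$, here is a minimal Python sketch. It is not taken from the text: the names `gauss_divmod` and `gauss_gcd`, and the choice to store $a + bi$ as the pair `(a, b)`, are assumptions made for illustration. The quotient is obtained by rounding each coordinate of the exact quotient to a nearest integer, which keeps the norm of the remainder smaller than the norm of the divisor.

```python
def gauss_divmod(a, b):
    """Divide Gaussian integers given as pairs (re, im); return (q, r) with a = q*b + r."""
    (a1, a2), (b1, b2) = a, b
    n = b1 * b1 + b2 * b2                  # norm of b, assumed nonzero
    x = a1 * b1 + a2 * b2                  # real part of a * conj(b)
    y = a2 * b1 - a1 * b2                  # imaginary part of a * conj(b)
    q1 = (2 * x + n) // (2 * n)            # x / n rounded to a nearest integer
    q2 = (2 * y + n) // (2 * n)            # y / n rounded to a nearest integer
    r1 = a1 - (q1 * b1 - q2 * b2)          # remainder = a - q * b
    r2 = a2 - (q1 * b2 + q2 * b1)
    return (q1, q2), (r1, r2)


def gauss_gcd(a, b):
    """A greatest common divisor of a and b (determined only up to a unit)."""
    while b != (0, 0):
        _, r = gauss_divmod(a, b)
        a, b = b, r
    return a


def norm(z):
    return z[0] * z[0] + z[1] * z[1]


g = gauss_gcd((3, 5), (1, 3))
print(g, norm(g))
```

The answer is only defined up to multiplication by a unit, so compare norms rather than the pair itself when you check your own work.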


26. Let $f(n)$ be a real-valued function on $1, 2, 3, \ldots$ with $f(ab) = f(a)f(b)$ 
    when $(a, b) = 1$. Show that as a formal Dirichlet series, 

    $$\sum_{n=1}^{\infty} \frac{f(n)}{n^s} = \prod_{p} \left( \sum_{k=0}^{\infty} \frac{f(p^k)}{p^{ks}} \right).$$ 

27. Consider 10 as an element of $\mathbb{Z}_p$. For example, $10 = 1$ in $\mathbb{Z}_3$, $10 = 3$ 
    in $\mathbb{Z}_7$, $10 = 10$ in $\mathbb{Z}_{11}$, .... Show that for $p \ne 2, 5$, the order of 10 as 
    an element of $\mathbb{Z}_p^*$ is the length of the period in the repeating decimal 
    expansion of $1/p$. 

28. Show that $x^4 + 7$ is irreducible in $\mathbb{Q}[x]$. Is it irreducible in $\mathbb{Z}_5[x]$? in 
    $\mathbb{Z}_{11}[x]$? 

29. Show that the only automorphisms of $\mathbb{Q}(i)$ leaving $\mathbb{Q}$ pointwise fixed are 
    the identity map and $\sigma : a + bi \mapsto a - bi$. 

30. Show that the only automorphisms of $\mathbb{Q}(\sqrt{2})$ are the identity and 
    $\sigma : a + b\sqrt{2} \mapsto a - b\sqrt{2}$. 

31. Compute the Galois group of $\mathbb{Q}(\zeta + \zeta^{-1})$, where $\zeta^7 = 1$, $\zeta \ne 1$. 

32. (a) If $G$ and $H$ are groups with operations $*$ and $\circ$ respectively, then 
    $G \oplus H$ is the set of pairs $(g, h)$, $g \in G$, $h \in H$, with the operation 
    $(g, h) \cdot (g', h') = (g * g', h \circ h')$. Show that $G \oplus H$ is a group. 

    (b) Show that an abelian group of order 8 must be isomorphic to 
    $G_2 \oplus G_2 \oplus G_2$ or $G_4 \oplus G_2$ or $G_8$, where $G_n$ denotes the cyclic group 
    with $n$ elements. 

33. Find an irreducible polynomial of degree 3 over $\mathbb{Z}_{11}$. 

34. Write a +x} + Ke as a polynomial in the elementary symmetric functions 
    $\sigma_1 = x_1 + x_2 + x_3$, $\sigma_2 = x_1x_2 + x_1x_3 + x_2x_3$, $\sigma_3 = x_1x_2x_3$ with coefficients 
    in $\mathbb{Q}$. 

35. What is the highest power of p dividing e a) for a prime p? 

36. Look up a primitive in $\mathbb{Z}_{41}$. Use it to solve the equations x° = 1 and 
    = Lan Zaye 

37. Is the product of the first $n$ primes plus 1 always prime? 

38. Show that there are infinitely many primes $\equiv 5 \pmod 6$. 

39. Use the fundamental theorem of arithmetic to show that \/31 is not 
    rational. 
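The problem above relating the order of 10 in $\mathbb{Z}_p^*$ to the decimal expansion of $1/p$ is pleasant to explore numerically before you prove anything. The following Python sketch is an illustration only; the function names `multiplicative_order` and `decimal_period` are made up here. It computes the order by repeated multiplication and the period by tracking remainders in long division, so the two numbers are obtained independently.

```python
def multiplicative_order(a, p):
    """Smallest k >= 1 with a**k congruent to 1 mod p; assumes gcd(a, p) == 1."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k


def decimal_period(p):
    """Length of the repeating block of 1/p, read off from long-division remainders."""
    seen = {}                 # remainder -> position at which it first appeared
    r, pos = 1 % p, 0
    while r not in seen:
        seen[r] = pos
        r = (r * 10) % p      # bring down the next decimal digit
        pos += 1
    return pos - seen[r]


for p in (3, 7, 11, 13, 17, 19):
    print(p, multiplicative_order(10, p), decimal_period(p))
```

Agreement for a handful of primes is evidence, not a proof; the long-division loop does, however, hint at why the two quantities should coincide.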


1.3 Dialing In Set 3 


This Dialing In set is mainly an algebraic excursion (although some analysis 
sneaks in). The algebra involves formal polynomial identities (arithmetic in 
Z|x], for example) and the connections to polynomial functions and equa- 
tions. Many of these problems look at results from high-school algebra in 
more general settings. Relax and have fun. And remember: you can pick and 
choose and then revisit (over and over). 


It isn’t known whether 
there are infinitely many p 
for which the order of 10 
is maximal, i.e., p — 1. 


It is not known whether 
there are infinitely many 
primes p for which 2 is a 
primitive in Zp. This is 
Emil Artin’s conjecture. 


This problem asks you 
to consider (once again) 
Exercise 3.8 in Sec- 
tion 3.1. 


Hint: Observe that 
$x - \alpha$ divides $x^n - \alpha^n$ 
(proof?). Then write 


$f(x) = f(x) - f(\alpha)$. 


In fact, you will show in 
Chapter 6 that $\cos nx \in$ 


$\mathbb{Z}[\cos x]$ (see Exercise 6.1, 

part ii). Or show it now— 
give it a try. 


40. How many subgroups does $\mathbb{Z}_{397}^*$ have? Write 397 as the sum of two 
    squares. Did you know that 5 is the smallest primitive in $\mathbb{Z}_{397}$? 

41. The largest prime less than 4000 is 3989. Show that 2 is a primitive in 
    $\mathbb{Z}_{3989}$.¹ 

42. Carry out the proof of the symmetric function theorem (Theorem 4.10) 
    for the case of two variables. Does the proof significantly simplify? 

43. Consider the ring $\mathbb{R}[x]$. Call two polynomials $f$ and $g$ equivalent if 
    $x^2 + 1$ divides $f - g$. We write $f \equiv g \bmod (x^2 + 1)$. Define multiplication 
    and addition of equivalence classes (after you have shown that 
    the relation between polynomials is an equivalence relation) and show 
    that the resulting ring is a field isomorphic to $\mathbb{C}$. 

44. Consider a fixed $E$ with $\mathbb{C} \supseteq E \supseteq \mathbb{Q}$. If $\dim_{\mathbb{Q}} E = 2$, show that 
    $E = \mathbb{Q}(\sqrt{\alpha}) = \{a + b\sqrt{\alpha} \mid a, b \in \mathbb{Q}\}$ for some integer $\alpha$. 

45. Let $R$ be a commutative ring with identity. If $f \in R[x]$ and $f(\alpha) = 0$ 
    for some $\alpha \in R$, show that $f(x) = (x - \alpha)h(x)$ for some $h(x) \in R[x]$. 
    (Note: You don’t have a division algorithm in general.) 

46. Show that the matrices of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ for $a$, $b$ real form a subfield 
    of the ring of all $2 \times 2$ matrices and that this subfield is isomorphic to $\mathbb{C}$ 
    via the mapping 

    $$\begin{pmatrix} a & b \\ -b & a \end{pmatrix} \mapsto a + bi.$$ 

    What does the norm in $\mathbb{Z}[i]$ correspond to? 

47. Use the existence of a primitive to prove Wilson’s theorem: $(p-1)! = -1$ 
    in $\mathbb{Z}_p$. 

48. Suppose that $f$ and $g$ are polynomials of degree $n$ in $\mathbb{Q}[x]$. Show that if 
    $f$ and $g$ agree on $n + 1$ values, then they are equal as functions on $\mathbb{Q}$. 

49. Construct a field $E$ with $p^2$ elements that contains $\mathbb{Z}_p$ for $p \equiv 3 \pmod 4$ 
    by imitating the construction of $\mathbb{C}$ from $\mathbb{R}$. Determine the automorphisms 
    of $E$ that leave $\mathbb{Z}_p$ fixed. What group do you get? 

50. Show that if $z$ is irreducible in $\mathbb{Z}[i]$, then $N(z)$ is either prime or the 
    square of a prime. 

51. Let $f(x)$ be an irreducible polynomial in $\mathbb{Q}[x]$. If $\alpha$ and $\beta$ are roots of 
    $f$, show that the field $\mathbb{Q}(\alpha)$ is isomorphic to $\mathbb{Q}(\beta)$ by an isomorphism 
    that takes $\alpha$ to $\beta$. 

52. Use de Moivre’s theorem to show that for each positive integer $n$, one 
    has 

    $$\sin nx \in \mathbb{Z}[\sin x, \cos x],$$ 

    $$\cos nx \in \mathbb{Z}[\sin x, \cos x],$$ 

    where as usual, $\mathbb{Z}[\alpha, \beta]$ means polynomials in $\alpha$ and $\beta$. 


¹Don’t do this problem. [When Ken wrote this footnote, pocket calculators, let alone 
computers, were not widely available. Go ahead and solve the problem!] 
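In the spirit of that footnote, here is one way a computer might check whether 2 is a primitive in $\mathbb{Z}_{3989}$. This is a sketch under stated assumptions, not the text's method: the function names are invented, and the test relies on the standard fact that $g$ generates the nonzero elements of $\mathbb{Z}_p$ exactly when $g^{(p-1)/q} \ne 1$ for every prime $q$ dividing $p - 1$.

```python
def prime_factors(n):
    """Set of primes dividing n, found by trial division."""
    factors, d = set(), 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def is_primitive(g, p):
    """True if g generates the nonzero elements of Z_p; p is assumed prime."""
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def smallest_primitive(p):
    """Least g >= 2 that is a primitive in Z_p; p is assumed to be an odd prime."""
    return next(g for g in range(2, p) if is_primitive(g, p))


print(is_primitive(2, 3989))
print(smallest_primitive(397))
```

The same two functions let you test the claim about the smallest primitive in $\mathbb{Z}_{397}$ for yourself.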
53. Let $S$ be a set and $G$ a group. Suppose each $g \in G$ is also a map from $S$ 
    to $S$ and denote the image of $s \in S$ under $g$ by $g(s)$. Suppose $e(s) = s$ for 
    the identity $e$ and $g_1 \cdot g_2(s) = g_1(g_2(s))$ for each $s$, $g_1$, $g_2$. Put 
    $G(s) = \{\, g(s) \mid g \in G \,\}$. Show that either $G(s) = G(s')$ or $G(s) \cap G(s') = \emptyset$ for 
    two elements $s$ and $s'$ from $S$. 

54. Consider the cubic polynomial $x^3 + x + 1$. Find a polynomial in $\mathbb{Q}[x]$ 
    whose roots are the squares of the roots of the above polynomial. 

55. Consider $x^3 + 4x^2 + 2x + 1 \in \mathbb{Z}_5[x]$. Find the polynomial whose roots are 
    $\alpha_1 + \alpha_2$, $\alpha_1 + \alpha_3$, and $\alpha_2 + \alpha_3$, where $\alpha_1, \alpha_2, \alpha_3$ are the roots of the above 
    polynomial in some extension field $E$ of $\mathbb{Z}_5$. 

56. Write $(x_1 - x_2)^2 (x_1 - x_3)^2 (x_2 - x_3)^2$ as a polynomial in $\mathbb{Z}[\sigma_1, \sigma_2, \sigma_3]$, 
    where $\sigma_1, \sigma_2, \sigma_3$ are the three elementary symmetric functions in $x_1$, 
    $x_2$, $x_3$. 

57. Consider $\zeta_{12} = \cos\frac{2\pi}{12} + i\sin\frac{2\pi}{12}$. Show that $\zeta_{12} + \zeta_{12}^{-1}$ is not a rational 
    number. 

58. Find the polynomial $f \in \mathbb{Q}[x]$ of lowest degree such that $f(\zeta_{12} + \zeta_{12}^{-1}) = 0$. 

59. Is $x^4 + x^2 + 1$ irreducible in $\mathbb{Q}[x]$? 

60. Consider $x^3 + ax^2 + bx + c = (x - \alpha)(x - \beta)(x - \gamma)$. Find the cubic 
    polynomial with roots $\alpha + \beta$, $\alpha + \gamma$, $\beta + \gamma$. Here $a, b, c, \alpha, \beta, \gamma$ belong to 
    a field. 

61. Calculate, up to isomorphism, all groups with eight elements. 

62. Use the proof of the theorem on the primitive element to construct a 
    primitive in $\mathbb{Z}_{13}$. 

63. Consider $f(x) = x^3 + ax^2 + bx + c = (x - \alpha)(x - \beta)(x - \gamma)$. Find the 
    polynomial with roots $\alpha + \beta + \alpha\beta$, $\alpha + \gamma + \alpha\gamma$, and $\beta + \gamma + \beta\gamma$. 

64. Consider the additive groups in $\mathbb{Z}_5$ and $\mathbb{Z}_3$. Show that $\mathbb{Z}_5 \oplus \mathbb{Z}_3$ is 
    isomorphic to $\mathbb{Z}_{15}$. 

65. (For those who need to review analysis.) Let $E \subset \mathbb{R} \times \mathbb{R}$ with $E$ compact. 
    Show that if $E \subset \bigcup_{\alpha} V_{\alpha}$, where each $V_{\alpha}$ is open and $\alpha$ ranges over an arbitrary 
    index set, then there exist $\alpha_1, \ldots, \alpha_n$ such that $E \subset V_{\alpha_1} \cup V_{\alpha_2} \cup \cdots \cup V_{\alpha_n}$. 

66. Show that the map $x \mapsto |x|$ of $\mathbb{C}$ to $\mathbb{R}$ is continuous. 


1.4 Dialing In Set 4 


This set develops some of the algebraic background that will be useful in our 
algebraic approach to the fundamental theorem of algebra. And as usual, it 
revisits some earlier results and previews some of what is coming up later in 
the book. And as usual, pick and choose and come back often. 


67. If $E \supseteq \mathbb{Q}$ where $E$ is a field of vector space dimension 3 over $\mathbb{Q}$, show 
    that $E = \mathbb{Q}(\alpha)$, where $\alpha$ is a root of an irreducible cubic in $\mathbb{Q}[x]$. 


Well, $x^3 + x + 1$ is 
irreducible in $\mathbb{Q}[x]$, right? 


Exercises like 60 and 63 
give a taste of the sym- 
metric function theorem, 
which is coming up. 


Our definition of compact 
is in Section 4.2. 


One way to state the 
theorem is that every 
polynomial with real 
coefficients has a root in 
the complex numbers. 
More is true, as you will 
see. 


68. Show that log, 7 is irrational. 

69. Show that $\sqrt{2} - \sqrt{3}$ is irrational. 

70. Calculate the minimal polynomial for $\sqrt{a} + \sqrt{b}$, where $a$ and $b$ are 
    square-free integers. 

71. Let $A$ be an integral domain. If $\alpha \in A$, call $\alpha$ irreducible if $\alpha = \beta\delta$ implies 
    that either $\beta$ or $\delta$ is a unit in $A$. Show that if $\Psi$ is an automorphism of $A$, 
    then $\Psi(\alpha)$ is irreducible if and only if $\alpha$ is irreducible. 

72. One of the other problems is a special case of problem 71. Which one 
    is it? 

73. Show that $1, \sqrt[3]{2}, \sqrt[3]{4}$ form a vector space basis for the field obtained by 
    adjoining to $\mathbb{Q}$ the real root of $x^3 - 2$. 

74. Find a reducible polynomial over $\mathbb{Q}$ with all roots nonreal. 

75. Check out Wilson’s theorem explicitly for $\mathbb{Z}_5$, $\mathbb{Z}_{11}$, $\mathbb{Z}_{13}$, $\mathbb{Z}_{17}$, $\mathbb{Z}_{19}$. Do not 
    cheat. 

76. This problem is omitted for lack of space. 

77. Consider $\mathbb{Z}_p[x_1, \ldots, x_n]$. Notice that $x_1^p + x_2^p + \cdots + x_n^p$ is symmetric. 
    According to the symmetric function theorem, it must be in 
    $\mathbb{Z}_p[\sigma_1, \ldots, \sigma_n]$. Which element is it? 

78. True or False: problem 77 is a trick. The answer is immediate. 

79. Let $\Psi$ be an automorphism of a ring $R$. Define $\Psi^*$ on $R[x]$ by 

    $$\Psi^*(a_0 + a_1 x + \cdots + a_n x^n) = \Psi(a_0) + \Psi(a_1)x + \cdots + \Psi(a_n)x^n.$$ 

    If $f$ and $g$ are in $R[x]$, show that $\Psi^*(f \cdot g) = \Psi^*(f) \cdot \Psi^*(g)$. 

80. Let $\alpha_1, \ldots, \alpha_n$ be indeterminates. Consider the polynomial 

    $$\prod_{i<j} \bigl(z - (\alpha_i + \alpha_j + \alpha_i\alpha_j)\bigr) = H(z).$$ 

    Show that the coefficients of $H(z)$ are symmetric in $\alpha_1, \ldots, \alpha_n$. 

81. Let 

    $$f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0 \in \mathbb{Q}[x]$$ 
    $$\phantom{f(x)} = (x - \theta_1)(x - \theta_2)\cdots(x - \theta_n).$$ 

    If $g(x) \in \mathbb{Q}[x]$, show that $\prod_{j=1}^{n} (x - g(\theta_j))$ is in $\mathbb{Q}[x]$. 

82. Factor x? + y? in $\mathbb{Z}_p[x, y]$. 

83. Show that if $0 < m < n$, then $\dfrac{n(n-1)\cdots(n-m+1)}{m!} \in \mathbb{Z}$. 

84. Show that $x^4 + 6x^2 + 1$ is irreducible over $\mathbb{Q}$ but reducible mod $p$ for all 
    primes $p$. 
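Before proving the last claim, you may want numerical evidence that $x^4 + 6x^2 + 1$ really does factor modulo each prime. The brute-force Python sketch below is an illustration, with the function name `reducible_mod_p` chosen here rather than taken from the text. A quartic over a field is reducible exactly when it has a root or splits as a product of two quadratics, so the code tests both possibilities. It uses `pow(b, -1, p)` for modular inverses, which assumes Python 3.8 or later.

```python
def reducible_mod_p(p):
    """True if x^4 + 6x^2 + 1 factors nontrivially over Z_p (p assumed prime)."""
    # A linear factor exists exactly when the quartic has a root in Z_p.
    if any((x**4 + 6 * x**2 + 1) % p == 0 for x in range(p)):
        return True
    # Otherwise search for (x^2 + a x + b)(x^2 + c x + d) by comparing coefficients.
    for a in range(p):
        c = (-a) % p                          # x^3 coefficient: a + c = 0
        for b in range(1, p):
            d = pow(b, -1, p)                 # constant term: b * d = 1
            x2_ok = (b + d + a * c) % p == 6 % p
            x1_ok = (a * d + b * c) % p == 0
            if x2_ok and x1_ok:
                return True
    return False


for p in (2, 3, 5, 7, 11, 13, 17, 19, 23):
    print(p, reducible_mod_p(p))
```

A finite search cannot settle the statement for all primes, and it says nothing about irreducibility over $\mathbb{Q}$; both still need an argument.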


85. Show that if $f$ is irreducible in $\mathbb{Q}[x]$, then in $\mathbb{C}[x]$ one has 

    $$f(x) = (x - \alpha_1)\cdots(x - \alpha_n),$$ 

    where $\alpha_i \ne \alpha_j$ for $i \ne j$. 

86. Consider $F(\alpha)/F$, where $\alpha$ is algebraic over $F$. Let $g(\alpha) \in F(\alpha)$. Show 
    that $g(\alpha)$ is algebraic and the minimal polynomial of $g(\alpha)$ has degree 
    dividing the degree of the minimal polynomial of $\alpha$. 

87. Calculate the irreducible monic polynomial in $\mathbb{Q}[x]$ that has /-1+W—-2 
    as a root. 

88. Find four real roots of $x^8 - 47x^4 + 1$. 

89. Consider $x^3 + ax + b = (x - \alpha)(x - \beta)(x - \delta)$. Express + +44 a in 
    B 
    terms of $a$ and $b$. 

90. Show that for every real number $N > 0$, there exist consecutive primes $p$ 
    and $q$ such that $q - p > N$. 

91. Find an algebraic number $\alpha$ such that $\alpha^n \notin \mathbb{Q}$ for all $n > 0$. 


1.5 Dialing In Set 5 


To finish up this algebraic tour, here is a collage of problems that revisits and 
extends some of the algebra we have used from group theory, ring theory, 
and polynomial algebra. It also previews some themes about irrationality in 
Chapter 5. Enjoy. 


92. Consider $\mathbb{Z}[i]$. Show that if $N(\alpha) = p$, where $p$ is prime, then $\alpha$ is 
    irreducible. 

93. Show that the reflections and rotations of an equilateral triangle form a 
    group of order 6 that is not abelian. 

94. Classify, up to isomorphism, all groups of order 6. 

95. Calculate the order of 10 mod $p$ for the primes 7, 11, 13, 17, 19 and show 
    that in each case, it is the length of the period in the repeating decimal 
    expansion of $1/p$. 

96. Consider the quartic $x^4 + x + 1$ mod 2. Find the polynomial in $\mathbb{Z}_2[x]$ whose 
    roots are the squares of the roots of $x^4 + x + 1$ (those latter roots being in 
    some larger field containing $\mathbb{Z}_2$). 

97. Calculate the number of irreducibles mod $p$ of degree 3 for low $p$. Any 
    conjectures as to a general formula? 

98. Although there is space for problem 98, that space has been used by the 
    present sentence (problems 99 to $\infty$ to follow). 

99. If $K$ is a field, show that $K[x]$ has unique factorization. 

100. Show that there are infinitely many nonisomorphic cyclic groups each 
     having no proper subgroup other than the identity subgroup. 


Hint: Consider the 
derivative f’(x). 


Hint: Study “Some 
properties of algebraic 
extensions of fields” in 
Section 2.2. 


For problems 113 and 
114, consult the proofs of 
Theorems 5.4 and 5.5. 


And for 116, check out the 
proof of Theorem 5.8. 


101. Show that there are infinitely many primes by considering $n! + 1$. 

102. Write out the proof of the symmetric function theorem (Theorem 4.7) 
     for the special case of three variables. 

103. Find the minimal polynomial for $\sqrt{2} + \sqrt[3]{2}$. 

104. Show that $\sqrt{2} + \sqrt[3]{2}$ is irrational. 

105. Write out carefully a proof that the set of algebraic numbers is countable. 

106. Construct a field with eight elements. 

107. Give an example of a nonabelian group with $2n$ elements for each positive 
     integer $n$. 

108. Show that if $G$ is a group with an even number of elements, then there is 
     an element of order 2 in $G$. 

109. Show that if 3 divides the order of a finite group $G$, then there is an 
     element of order 3. Don’t use any general theorems that automatically 
     give the result (like Cauchy or Sylow). 

110. Show that x° + 6x7 + 18x? + 463104x + 1155 is irreducible in $\mathbb{Q}[x]$. 
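For the problem asking for a field with eight elements, one construction imitates the passage from $\mathbb{R}$ to $\mathbb{C}$: take polynomials over $\mathbb{Z}_2$ modulo an irreducible cubic. The Python sketch below is one possible illustration and not the text's own solution. It assumes the modulus $x^3 + x + 1$ (which has no root in $\mathbb{Z}_2$, hence is irreducible) and encodes $c_2x^2 + c_1x + c_0$ as the 3-bit integer with binary digits $c_2c_1c_0$; the name `gf8_mul` is hypothetical.

```python
MOD = 0b1011  # x^3 + x + 1, the assumed irreducible modulus over Z_2


def gf8_add(a, b):
    """Addition of coefficient vectors mod 2 is bitwise XOR."""
    return a ^ b


def gf8_mul(a, b):
    """Multiply two elements of Z_2[x]/(x^3 + x + 1), each encoded as a 3-bit integer."""
    prod = 0
    for i in range(3):
        if (b >> i) & 1:
            prod ^= a << i               # add a * x^i
    for shift in (1, 0):                 # cancel the x^4 term, then the x^3 term
        if (prod >> (shift + 3)) & 1:
            prod ^= MOD << shift
    return prod


# Every nonzero element should have a multiplicative inverse.
for a in range(1, 8):
    inverse = next(b for b in range(1, 8) if gf8_mul(a, b) == 1)
    print(a, inverse)
```

Checking inverses by exhaustive search is feasible only because the field is tiny; the point of the exercise is to explain why they must exist.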


1.6 Dialing In Set 6 


Finally, we get to some analysis. Some of the problems ask you to fill in 
details in the proofs in Chapter 5. These proofs all use the same basic method 
(proof by contradiction) to show that a cleverly constructed function cannot 
exist. The purpose of many of the problems in this set is to show that the 
“cleverness” of these functions is no mystery—they are defined as a result 
of looking at concrete examples and abstracting off the properties needed to 
obtain a contradiction. There is algebra in here, too, just to mix things up. 
And some problems belong to more than one mathematical field. 


111. Show that every rational number has a repeating decimal expansion. 

112. Show that every repeating decimal is a rational number. 

113. The irrationality of e” follows the same pattern of proof as that of z. Can 
     you use a similar method to show that 2” is irrational? How about x? for 
     context. 

114. In the proof of the irrationality of 2, why won’t $x(1 - x)/n!$ work in 
     place of $x^n(1 - x)^n/n!$? 

115. Let $f(x) \in \mathbb{Z}[x]$ and consider $h(x) = x^n f(x)/n!$. Show that the $h^{(j)}(0)$ 
     are always integers. Prove also that $n + 1$ divides $h^{(j)}(0)$ for $j \ne n$. What 
     happens at $j = n$? 

116. In the proof of the transcendence of $e$ why can’t 

     $$\frac{x^{p-1}\,[(x - 1)\cdots(x - n)]^p}{(p - 1)!}$$ 

     be replaced simply by 

     $$\frac{[x(x - 1)\cdots(x - n)]^p}{(p - 1)!}\,?$$ 

117. Let Z = cos a +isin a Compute the minimal polynomial for $\zeta + \zeta^{-1}$. 

118. Show that $a + bi$ is algebraic over $\mathbb{Q}$ if and only if $a$ and $b$ are algebraic 
     over $\mathbb{Q}$. 

119. Find the number of representations of 119 as the sum of two squares. 

120. Using the theorem of Weierstrass–Lindemann (Theorem 5.2), show that 
     $\log \alpha$ is transcendental for $\alpha$ a nonzero real algebraic number, $\alpha \ne 1$. 

121. Show that under the assumptions of problem 120, $\sin \alpha$ is transcendental 
     for $\alpha$ nonzero real algebraic. 

122. Show that if $\alpha$ is transcendental, then so is $\alpha^r$ for $r \ne 0$ and rational. 

123. Show that the set of all algebraic numbers in $\mathbb{C}$ is an algebraically closed 
     field. 

124. Find the minimal polynomial for V7 + V/11 in $\mathbb{Z}[x]$. 

125. Use the proof of the transcendence of $e$ (Theorem 5.5) to show that $e$ 
     does not satisfy a relation of the type $ae^2 + be + c = 0$ with $a, b, c$ rational. 

126. Show that if $f$ and $g$ are $n$-times differentiable real-valued functions, 
     then 

     $$(fg)^{(n)} = \sum_{k=0}^{n} \binom{n}{k} f^{(k)} g^{(n-k)},$$ 

     where $f^{(m)}$ denotes the $m$th derivative of $f$. 

127. Prove that $f(s)$ is always an integer for $s = 1, 2, \ldots, n$, where 

     $$f(x) = \frac{x^{p-1}\,[(x - 1)(x - 2)\cdots(x - n)]^p}{(p - 1)!}.$$ 

128. Show that if $E \supseteq F$ are fields and $E$ is a one-dimensional vector space 
     over $F$, then $E = F$. 

129. Use the method of proof for the irrationality of $e$ to exhibit other 
     irrational numbers. 

130. Can you use the method of proof for the irrationality of $e$ to show that 

     $$1 + \frac{1}{2} + \frac{1}{(2 \cdot 3)^2} + \frac{1}{(2 \cdot 3 \cdot 5)^3} + \cdots + \frac{1}{(2 \cdot 3 \cdots p_n)^n} + \cdots$$ 

     is irrational, where $p_n$ is the $n$th prime? 

131. By considering ie “tan” x dx, show that 

     iii oo. 254 
     2 3 4 
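Several problems in these sets ask about writing an integer as a sum of two squares, including the one about 119 above. A direct count is easy to program and gives you data against which to test conjectures. In this Python sketch the function `r` counts ordered pairs of integers $(x, y)$, signs included, with $x^2 + y^2 = n$; that convention is an assumption made here and may differ from the definition of $r(n)$ in Section 3.4.

```python
from math import isqrt


def r(n):
    """Number of ordered integer pairs (x, y) with x*x + y*y == n, for n >= 1."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        y_squared = n - x * x
        y = isqrt(y_squared)
        if y * y == y_squared:
            count += 1 if y == 0 else 2   # y and -y are distinct unless y == 0
    return count


for n in (5, 13, 25, 65, 119):
    print(n, r(n))
```

Tabulating `r(n)` for many values of `n` and grouping by the prime factorization of `n` is a good way to guess the general pattern before reading Section 3.4.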


The restriction to “real” 
is inessential in problems 
120 and 121. 


This proof is due to 
McKay [55]. 


Liouville’s theorem: 
Theorem 5.1. 


132. Reduce the proof of the transcendence of $e$ to the case $n = 1$. This gives 
     a proof of the irrationality of $e$ that does not depend on the series for $e$. 

133. Suppose a finite group satisfies the condition that $x^d = 1$ has exactly $d$ 
     solutions for every $d \mid n$. Show that the group is cyclic. 

134. Use the pentagon game of Section 2.2 to find a fifth root of unity in $\mathbb{Z}_{41}$. 

135. Show that z does not satisfy a quadratic equation with rational 
     coefficients. 

136. (For those who have done Dialing In problem 53, a really important 
     problem.) Let $G$ be a finite group and let $p$ divide $|G|$, $p$ prime. Put 

     $$S = \{(a_1, \ldots, a_p) \mid a_1 \cdot a_2 \cdots a_p = e,\ a_j \in G\}$$ 

     and let $\mathbb{Z}_p = \{1, \sigma, \ldots, \sigma^{p-1}\}$ be a cyclic group of order $p$ that operates 
     on $G$ by $\sigma(a_1, \ldots, a_p) = (a_p, a_1, a_2, \ldots, a_{p-1})$. Show that this is a good 
     action on $S$, and by counting orbits, show that there is an element of 
     order $p$ in $G$ (Cauchy’s theorem). 

137. Let $p \equiv 3 \pmod 4$ and consider $\mathbb{Z}_p(\sqrt{-1})$. Show that conjugation is 
     the $p$th-power map. 

138. Using the mean value theorem, get an explicit estimate for the constant 
     in Liouville’s theorem on approximation of algebraic numbers. 

139. Can you get an explicit proof of Liouville for \/2? 

140. (For those who have done Dialing In problem 53, still a really important 
     problem.) Let $G$ operate as a transformation group on a set $S$. If $s \in S$, 
     let $J_s$ be the subgroup of all $g \in G$ such that $g(s) = s$. Show that the 
     number of cosets of $J_s$ is equal to the number of distinct elements in the 
     orbit $G(s)$ of $s$. 

141. Let $F$ be a field and $\alpha, \beta, \gamma \in F$. Suppose that $\alpha + \beta + \gamma = 0$. Show that 
     $\alpha^3 + \beta^3 + \gamma^3 = 3\alpha\beta\gamma$. 

142. Let $n$ be a positive integer. Consider 

     $$\Psi_n(x) = \prod_{\substack{(j,n)=1 \\ 1 \le j \le n}} (x - \zeta^j),$$ 

     where $\zeta = e^{2\pi i/n} = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$. Show that $\Psi_n(x) \in \mathbb{Z}[x]$. 

143. A subgroup $H$ of a group $G$ is called normal if $aH = Ha$ for all $a \in G$. 
     Let $G$ be a group of order $p^n$ and $H$ a subgroup of order $p^{n-1}$. Using 
     orbits and counting, find a result concerning the normality of $H$. 

144. Compute 

     1 x y Zz 
     2. 0’ 2 
     x“ yz 
     t 
     ie ff a 
     1 x4 i z4 

145. Consider $\zeta_{11} = \cos\frac{2\pi}{11} + i\sin\frac{2\pi}{11}$. Find the minimal polynomial for $\zeta_{11} + \zeta_{11}^{-1}$. 
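The polynomials $\Psi_n(x)$ introduced above can be computed exactly, which makes the claim about integer coefficients easy to watch in action. The Python sketch below is an illustration, not the proof the problem asks for. It assumes the standard factorization $x^n - 1 = \prod_{d \mid n} \Psi_d(x)$ and peels off the factors for proper divisors by exact long division. Polynomials are stored as coefficient lists with the constant term first, and the names `poly_divexact` and `cyclotomic` are invented for this example.

```python
def poly_divexact(num, den):
    """Quotient num / den for integer coefficient lists (lowest degree first).

    den must be monic and must divide num exactly.
    """
    num = num[:]                                   # work on a copy
    quot = [0] * (len(num) - len(den) + 1)
    for i in range(len(quot) - 1, -1, -1):
        quot[i] = num[i + len(den) - 1]            # den is monic, so no division
        for j, c in enumerate(den):
            num[i + j] -= quot[i] * c
    assert not any(num), "division was not exact"
    return quot


def cyclotomic(n, _cache={}):
    """Coefficients of Psi_n(x), computed from x^n - 1 and the Psi_d with d | n, d < n."""
    if n not in _cache:
        poly = [-1] + [0] * (n - 1) + [1]          # x^n - 1
        for d in range(1, n):
            if n % d == 0:
                poly = poly_divexact(poly, cyclotomic(d))
        _cache[n] = poly
    return _cache[n]


for n in (1, 2, 3, 5, 6, 12):
    print(n, cyclotomic(n))
```

Because every division is by a monic integer polynomial, the quotients never leave $\mathbb{Z}[x]$; turning that observation into an induction is one route to the result.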


146. Show that if $m > 1$, $m$ an integer, then $|\Psi_n(m)| > (m - 1)^s$, where $s$ is 
     the number of positive integers less than $n$ and relatively prime to $n$. 

147. Fill in the blanks: The ____ of ____ is ____ or ____. 


1.7 Dialing In Set 7 


Fourier series are added to the mix of Dialing Ins from the first five chapters. 
The formula in problem 151 might seem quite mysterious, but the mystery 
will be solved in Chapter 6. The result of problem 158 might seem obvious, 
but how can you prove it? 


148. 


149. 
150. 


151. 
152. 


153. 


154. 


155. 


156. 


157. 
158. 
159. 


160. 


161. 


Show that the Galois group of $Q(\zeta_p)$ over $Q$ is a cyclic group of order
$p - 1$. Here $p$ is a prime larger than 2.


Calculate the Fourier series for $f(x) = x^2$ on $[-\pi, \pi]$.


Calculate the Fourier series for $f(x) = \cos\frac{x}{2}$ on $[-\pi, \pi]$.


Find a Fourier series proof of us =1+ + + L tees, 


Find the Fourier series for cos x if cos is rendered an odd function on
$[-\pi, +\pi]$ by defining

$$f(x) = \begin{cases} \cos x, & 0 < x < \pi, \\ -\cos x, & -\pi < x < 0. \end{cases}$$


(For those who have done Dialing In 53 and related problems.) If G is 
a group, then the center of G is the set of all x such that xy = yx for 
all $y \in G$. Show that the center $Z(G)$ is a subgroup and that if $G$ is a
$p$-group (that is, $|G| = p^n$ for a prime $p$), then $Z(G) \neq \{e\}$.

Let $p$ and $q$ be distinct primes. What can you say about $Q(\zeta_p, \zeta_q)$? Can
you find $\xi$ such that $Q(\zeta_p, \zeta_q) = Q(\xi)$? If not, why not? If so, why?


Calculate the Fourier series for f(x) = ie —HSXS<T. 


Let $F$ be a field with $p^n$ elements, $p$ prime. Show that $\sigma : F \to F$ defined
by $\sigma(x) = x^p$ is an automorphism. Show also that the fixed field of $\sigma$ is
(isomorphic to) $Z_p$.

Show that $\int_0^\infty x^{n-1} e^{-x}\,dx = (n-1)!$ for $n$ an integer, $n \ge 1$.

Prove that there is no integer x such that 0 < x < 1. 


Write out a careful proof that the Fourier series averages at jump discon- 
tinuities with finite left slope and right slope. 


Compute $\int_\Gamma e^z\,dz$, where $\Gamma$ is the unit circle traversed counterclockwise.


Compute also $\int_\Gamma z^n\,dz$, $n \ge 1$. Finally, be sure to calculate $\int_\Gamma \frac{1}{z}\,dz$.


Consider the curve defined by $y = \sqrt{|x|}$ on $[-\pi, \pi]$. The left and right
derivatives at 0 are $\infty$. Show that nevertheless, Dirichlet’s argument can
be modified to give the Fourier series at 0.




$\Psi_n$ is defined above in
problem 142.


Calculating Fourier series 
for specific functions 
builds algebraic muscle. 


Hint: Sketch a graph. 


But $f \neq g$ in Z[x]. How
can this happen? 


162. 


163. 
164. 
165. 


166. 


167. 
Using

$$1 + e^{ix} + e^{2ix} + \cdots + e^{inx} = \frac{e^{i(n+1)x} - 1}{e^{ix} - 1},$$

show that

$$\frac{1}{2} + \cos x + \cos 2x + \cdots + \cos nx = \frac{\sin\left(n + \frac{1}{2}\right)x}{2\sin\frac{x}{2}}.$$


Show other things too. 


Show that $\int_0^1 x^{-x}\,dx = 1 + \frac{1}{2^2} + \frac{1}{3^3} + \frac{1}{4^4} + \cdots$. Hint: See problem 157.
Omitted. 


Show that $x^2 + y^2 = -1$ always has a solution for $x, y$ in $Z_p$. Hint: See
problem 164. 


Let $f(x) = x^5 + x^2 + 2x$ and $g(x) = x^2 + 3x$. Show that $f(a) = g(a)$ for
all $a$ in $Z_5$.


Show that $(x + y)^{p^n} = x^{p^n} + y^{p^n}$ in $Z_p[x, y]$.


1.8 Dialing In Set 8 


This last set contains more variations on the recurring themes that run through 
the book, developing some especially nice identities. The one in problem 172 
shows up in many texts (precollege and undergraduate) without proof, joining 
the long list of identities that are part of the folklore of mathematics. So too 
with problems 181 and 194. Chapter 6, with its results and methods of Fourier 
series, will give you the tools to prove identities like this and more exotic ones 
like the stunning formula in problem 180. 


168. 


169. 
170. 


171. 


172. 


173. 


Write down a group of order 6561 every nonidentity element of which 
has order 3. 


Left out due to lack of ideas. 


Let $E \supset Z_p$ be fields and assume that $E$ is finite-dimensional over $Z_p$.
Consider the mapping from $E$ into $E$ defined by $\varphi(x) = x^p$. Show that
$\varphi$ is onto $E$.


Show that for $x \in [0, \pi]$,

$$\frac{\pi^2}{8} - \frac{\pi x}{4} = \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots.$$
Prove that 
1 1 1 1 1 1 1 1 
1+ + + toe, 
3 5 7 11 13 #17 19 23 


Calculate the units in the ring $Z[\sqrt{-5}]$, which comprises all $a + b\sqrt{-5}$,
$a, b \in Z$.


179. 


180. 


181. 


182. 


183. 
Show that Lemma 3.11 in Section 3.2 breaks down for $Z[\sqrt{-5}]$ and find
a counterexample to Lemma 3.16 for $Z[\sqrt{-5}]$.


Let $p$ and $q$ be distinct primes. When is $\zeta_p \in Q(\zeta_q)$? Here, of course,
$p \neq 2$.


Find the minimal polynomial for $\zeta_{p^2}$ for an arbitrary prime $p$.


. Find the nonprime that differs by 2 from the 41st prime. 


In Z[x], let

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0,$$
$$g(x) = b_m x^m + b_{m-1} x^{m-1} + \cdots + b_0.$$

Suppose that $a_n, a_{n-1}, \ldots, a_0$ have no common factors except $\pm 1$, and
$b_m, b_{m-1}, \ldots, b_0$ have no common factors except $\pm 1$. Show that the same
is true of the coefficients of $f(x) \cdot g(x)$.


Take It Further. Consider the ring Z[x]. Show that the polynomial 


zee xt gt yg ghz 2 4... 4 x12 + x is irmeducible in (Z[x]) [z]. 


Take It Further. Show that for x € (—7, 7), one has 


, sinnx 


P20 ey spy 
n=l n n=1 n 


Conclude that sin x — abst 4. Sin 3x 


of degree 3 in R[x]. 


—-++- 1s (on (—7, 2)) a polynomial in x 


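Prove (at least formally) that

$$\sin^{-1} x = x + \frac{1}{2}\cdot\frac{x^3}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{x^5}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\cdot\frac{x^7}{7} + \cdots.$$

Letting $x = 1$, we have

$$\frac{\pi}{2} = 1 + \frac{1}{2}\cdot\frac{1}{3} + \frac{1\cdot 3}{2\cdot 4}\cdot\frac{1}{5} + \frac{1\cdot 3\cdot 5}{2\cdot 4\cdot 6}\cdot\frac{1}{7} + \cdots,$$

a positive series for $\pi$.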


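Making cos odd on $(-\pi, \pi)$ by defining

$$f(x) = \begin{cases} \cos x, & 0 < x < \pi, \\ -\cos x, & -\pi < x < 0, \end{cases}$$

show that

$$\cos x = \frac{8}{\pi}\sum_{n=1}^{\infty} \frac{n\sin 2nx}{4n^2 - 1}$$

for $x \in (0, \pi)$.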


Show that every group of order $p^2$, where $p$ is prime, is abelian.


For this reason, Zp is 
called perfect. It truly is 
perfect. 


184. 


185. 


186. 
187. 


188. 
189. 


190. 


191. 
192. 


193. 
194, 
195. 


196. 


197. 
198. 
199. 
200. 
Let $f(x)$ be irreducible of degree $n$ in $Z_p[x]$. Suppose $f(x) = (x - \alpha_1)
\cdots (x - \alpha_n)$, where $\alpha_i \in E \supset Z_p$. Find the polynomial whose roots are the
$p$th powers of the roots $\alpha_1, \ldots, \alpha_n$.

Consider $E \supset F$, where $E$, $F$ are fields. If $\alpha \in E$ is algebraic over $F$ and
$\sigma$ is an automorphism of $E$ leaving $F$ elementwise fixed, then show that
$\alpha$ and $\sigma(\alpha)$ have the same minimal polynomial.

Relate problem 185 to problem 184. 


Let $f(x)$ be irreducible in $Z_p[x]$. Show that $f(x)$ cannot have a repeated
root in any extension field of $Z_p$.
Find the Fourier series for sin>(x). 


x 1 1 
6! at" 2x2 


COS X 


x4 7 


Calculate lim ( = ) if the limit exists. 


Calculate lim —_ d x |. 
x 30 \ * 2 sin 


Evaluate f° “** - cos.x dx. Hint: The answer is 7/2. 
Let G be a group of order 2p, where p is prime. Is G abelian? How many 
nonisomorphic groups are there of order 2p? 


Calculate the Fourier series on $[-\pi, \pi]$ for $e^x$.


Show that $\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots$.
Establish the following trig identities:

$$\cos 2\theta = 2\cos^2\theta - 1,$$

$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta,$$

$$\cos 4\theta = 8\cos^4\theta - 8\cos^2\theta + 1.$$


Prove that if $x \notin \pi \cdot Z$, then

$$\sin x + \sin 3x + \cdots + \sin(2n - 1)x = \frac{\sin^2 nx}{\sin x}.$$


: F si 2 nt _ an 
Using problem 196, show that fy” “y"* dt = 7. 


Show that $\frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \cdots + \frac{1}{n}$ is never an integer.


Consider F = Q(W2,i) = Q(/2)(i). Calculate the Galois group of F/Q. 
Let $G$ be a finite group of order $n = pm$, $p \nmid m$. Show that if $H_1$ and
$H_2$ are subgroups of order $p$, then there is an element $a \in G$ such that
$aH_1a^{-1} = H_2$.


Polygons and Modular 
Arithmetic 


There are connections between algebra and geometry that go well beyond the 
function—graph—analytic-geometry connections studied in high school. 

We will use the field of complex numbers to tie together the geometry of 
regular polygons and the algebra of polynomials. As a bonus, we will meet 
some number theory and a little group theory, all bound up in a delightful 
package. Here we go... 


2.1 The Complex Numbers 


One of the richest mathematical structures is the field of complex numbers. 
With its wonderful balance of algebraic, analytic, and topological properties, 
it has played a major role in the development of classical and modern mathe- 
matics. 

But as late as the nineteenth century, the existential status of this field was 
unclear, resulting in the unfortunate but colorful adjective “imaginary.” One 
reason for this is that unlike the folklore prevalent in most school algebra 
texts, complex numbers originally appeared not as attempts to adjoin $\sqrt{-1}$ to
the real numbers, but as devices that were used in algorithms that produced 
solutions to cubic equations with real coefficients and real solutions. These 
“imaginaries” occurred in the algorithms at certain points but canceled out in 
the end. 

It took more than two centuries before the reification of complex numbers 
as pairs of real numbers came into common usage. This topological visual- 
ization as the Cartesian plane by Carl Friedrich Gauss (1777-1855) and Jean- 
Robert Argand (1768-1822) already suggested a rigorous definition, and the 
realization that the correspondence $a + bi \leftrightarrow (a, b)$ admitted both algebraic
and geometric interpretation was a breakthrough in mathematics. The flood- 
gates surrounding the idea soon broke open. 

But in point of fact, simply the consideration of expressions like a + bi, 
where $i^2 = -1$, gives a perfectly valid algebraic construction. This point
of view was generally accepted by Leopold Kronecker’s (1823-1896) time. 
The definition of the field of complex numbers as ordered pairs with the 
desired multiplicative and additive structure had already appeared in Hein- 
rich Weber’s Lehrbuch der Algebra. James Pierpont, who reviewed the first 
edition, commented: 


© Springer Nature Switzerland AG 2023 15 
K. Ireland and A. Cuoco, Excursions in Number Theory, Algebra, and Analysis, 
Readings in Mathematics, https://doi.org/10.1007/978-3-03 1-13017-5_2 




Those high-school con- 
nections are important 
too. 


In 1797, Caspar Wessel 
presented a paper to the 
Royal Danish Academy 
of Sciences entitled “On 
the Analytic Represen- 
tation of Direction: An 
Attempt.” Largely unno- 
ticed, it contained the 
essence of the geometric 
correspondence. 
Leopold Kronecker

gave a procedure for 
constructing a general 
field that contains all the 
roots of a polynomial with 
coefficients in that field, a 
construction that applies 
to C. More about this in 
Chapter 4. 


Section 2.3 develops 
another way to think about 
C using modular arith- 
metic for polynomials, 
very much in the spirit of 
Kronecker’s formulation. 




In so small a space as this, the complex numbers and the four arithmeti- 
cal operations upon it are defined. Of the mystery that once surrounded 
this number, not an atom is left by such a treatment; fractions and irra- 
tional numbers, negative and complex numbers, all stand on the same 
footing; all are equally real or unreal. 


Thus the romance of the imaginary was replaced by the romance of abstract 
construction. 

Throughout this book, we will denote by R the field of real numbers, and 
we will assume that you are familiar with their structure. For a quick and 
elegant review, consult the first several chapters of Jean Dieudonné’s Foun- 
dations of Modern Analysis [22]. 


By C we will denote the field of complex numbers. Recall that this field 
is conveniently defined as the set R x R with the following addition and mul- 
tiplication: 


$$(a, b) + (c, d) = (a + c,\ b + d),$$
$$(a, b) \cdot (c, d) = (ac - bd,\ bc + ad).$$


You should quickly verify that these definitions indeed impose the struc- 
ture of a field (a general set with two operations satisfying all the ordinary 
high-school rules of associativity and commutativity for both operations, 
inverses for the nonzero elements, and the distributive laws) on R x R, where 
the additive identity is (0,0) and the multiplicative identity is (1,0). If we 
identify (a,0) with the number a € R, then R becomes a subfield of C. Now 
by definition, 


$$(0, 1) \cdot (0, 1) = (-1, 0).$$


So on putting $(0, 1) = i$, we have, using the above identification, $i^2 = -1$,
which was what we wanted in the first place. Furthermore, every element of 
C is uniquely represented in the form a + bi, where a and b are in R, which 
is rephrased by saying that C is a two-dimensional vector space over R with 
basis 1 and i. 
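
As a quick sanity check of the definitions above, here is a minimal Python sketch that models complex numbers as ordered pairs of reals. The function names `add` and `mul` are illustrative choices, not notation from the text.

```python
# Complex numbers modeled as ordered pairs (a, b), standing for a + bi.

def add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a, b) . (c, d) = (ac - bd, bc + ad)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, b * c + a * d)

i = (0, 1)
i_squared = mul(i, i)              # the pair (-1, 0), identified with -1
sample_sum = add((1, 2), (3, 4))
sample_product = mul((1, 2), (3, 4))
```

Multiplying a pair by `(1, 0)` returns the pair unchanged, which matches the claim that `(1, 0)` is the multiplicative identity.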

We can therefore represent the set of complex numbers C in Cartesian 
coordinates, where the horizontal axis represents the real numbers a, the ver- 
tical axis represents the complex numbers bi for real b, and the point (a, b) 
represents the number a + bi. We can now say that a complex number lies in 
the complex plane. 

Observe that the multiplicative inverse of $a + bi \neq 0$ is

$$\frac{a - bi}{a^2 + b^2},$$

the existence of which is ensured by the fact that $a^2 + b^2 = 0$ if and only if
$a = b = 0$. This is equivalent to the fact that $-1$ is not a square in R, which
was the original observation of the subject.

Not only is —1 not a square, it is not even the sum of squares in R. A great 
deal of interesting mathematics has arisen out of generalizing this notion to a 
general field. 




Lookout Point 2.1. A field F is called formally real if $-1$ is not the sum of
squares. If it is the sum of squares, then we call the minimum positive integer
$s(F)$ for which $-1$ is the sum of $s(F)$ squares the level of the field. We shall
see later in connection with our study of modular arithmetic that the integers
modulo p have level one if 4 divides $p - 1$. A very beautiful result was proved
by Albrecht Pfister in 1965 [62]. He showed that the level of a field is always
a power of 2. And furthermore, given any power of 2, say $2^n$, there is a field
with that level. Pfister has proved many other exciting results in the modern
theory of quadratic forms and questions related to Hilbert’s 17th problem.


Returning to the complex numbers C, there is a very important operation
largely responsible for much of C’s success. It is called conjugation and is
defined by $\bar z = a - bi$, where $z = a + bi$. Geometrically, conjugation is a
reflection in the x-axis, and algebraically, it is an automorphism over R of
order 2. These statements are codified in a theorem:


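Theorem 2.1. If z and w are complex numbers, then the following relations
hold:
(i) $\overline{z + w} = \bar z + \bar w$,
(ii) $\overline{zw} = \bar z\,\bar w$,
(iii) $\bar z = z$ if and only if $z \in R$,
(iv) $z\bar z \in R$,
(v) $\bar{\bar z} = z$.
It follows that the map $z \mapsto \bar z$ is one-to-one and onto.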

The real number $z\bar z$ is called the norm of z and is denoted by N(z). If
$z = a + bi$, then $N(z) = z\bar z = a^2 + b^2$, and since $\sqrt{a^2 + b^2}$ is the distance from
the origin to the point z in the complex plane, we call $\sqrt{N(z)}$ the absolute
value or modulus of z and denote it by |z|. The norm inherits most of its
properties from the above theorem. One that will be very important in what
follows is that the norm function is multiplicative.


Corollary 2.2. If z and w are complex numbers, then

$$N(zw) = N(z)\,N(w).$$

In particular, $N(z^2) = (N(z))^2$.


Lookout Point 2.2. If z = (a+ bi) and w = (c + di), the result of Corol- 
lary 2.2, when written out in all its glory, becomes 


$$(a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (bc + ad)^2.$$


Establishing this identity shows up in some high-school texts as an exercise 
(try it). The multiplicativity of the norm shows where it comes from. Alge- 
braically, we see that the product of the two “quadratic forms” on the left is a 
sum of two squares of bilinear forms in all the variables. 
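
The two-squares identity can be checked mechanically. The following is a small sketch in Python, with hypothetical helper names, that tests the identity on a grid of integers and produces one concrete instance.

```python
from itertools import product

def two_square_product(a, b, c, d):
    """Return (u, v) with (a^2 + b^2)(c^2 + d^2) = u^2 + v^2."""
    return (a * c - b * d, b * c + a * d)

def identity_holds(a, b, c, d):
    u, v = two_square_product(a, b, c, d)
    return (a * a + b * b) * (c * c + d * d) == u * u + v * v

# Exhaustive check on a small grid; evidence for the identity, not a proof.
all_ok = all(identity_holds(*q) for q in product(range(-5, 6), repeat=4))

# 5 = 1^2 + 2^2 and 13 = 2^2 + 3^2, so 65 is again a sum of two squares.
example = two_square_product(1, 2, 2, 3)
```

Here `example` is the pair whose squares add up to 65, obtained exactly as the norm of the product (1 + 2i)(2 + 3i).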
See Section 3.5 for more
on Hilbert’s 17th problem. 


Property (iii) is really 
important—it character- 
izes R as a subfield of C. 
We will use it often. 


See Exercise 2.2 for the 
meanings of “one-to-one” 
and “onto.” 


If z = a+ bi, then 
N(z) = a? +b’. 
Pythagoras, anyone? 
The 3-sphere is the set
of all $(a, b, c, d)$ with
$a^2 + b^2 + c^2 + d^2 = 1$.


... another old chestnut 
from high school. 


For a basic and complete 
development of elemen- 
tary trigonometry, see the 
notes written by the late 
Dick Askey at http://go. 
edc.org/askey- trig-2021. 


In addition to expediency 
and elegance, this devel- 
opment inverts the usual 
path (the usual path: from 
geometry of the unit circle 
to the algebra of power 
series). It is probably not 
a good way to introduce 
trig, but it is a very elegant 
example of the power of 
old-fashioned algebra. 
In general, Adolf Hurwitz [40] showed that the identity

$$(x_1^2 + \cdots + x_n^2)(y_1^2 + \cdots + y_n^2) = z_1^2 + \cdots + z_n^2,$$

where the $z_i$ are bilinear expressions in the x’s and y’s, has a solution only
for n = 1, 2, 4, 8.


The set of complex numbers with norm 1 is called the unit circle. We
denote it by $S^1$. Hence

$$S^1 = \{z \mid N(z) = 1\} = \{z \mid z = x + iy,\ x^2 + y^2 = 1\}.$$


Topologists call this a 1-sphere. (What is the common name for a 2-sphere?)
Notice that $S^1$ is a multiplicative group. It has lots of points with rational
coordinates. In fact, the points

$$\left(\frac{2t}{1 + t^2},\ \frac{1 - t^2}{1 + t^2}\right)$$

are on $S^1$ for every real $t$ (check this). Notice that we are doing number theory
again, because the homogenized identity of this reads

$$(2xy)^2 + (x^2 - y^2)^2 = (x^2 + y^2)^2,$$

and that says that there are infinitely many (primitive) Pythagorean triples,
i.e., triples $(a, b, c)$ of coprime integers with $a^2 + b^2 = c^2$.
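
To see the parametrization at work, here is a short Python sketch using exact rational arithmetic. The function names are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def rational_point(t):
    """Point (2t/(1+t^2), (1-t^2)/(1+t^2)) on the unit circle, for rational t."""
    t = Fraction(t)
    return (2 * t / (1 + t * t), (1 - t * t) / (1 + t * t))

def triple(x, y):
    """Triple (2xy, x^2 - y^2, x^2 + y^2) from the homogenized identity."""
    return (2 * x * y, x * x - y * y, x * x + y * y)

point = rational_point(Fraction(1, 2))
triples = [triple(2, 1), triple(3, 2), triple(4, 1)]
```

Taking t = y/x with integers x > y > 0 turns each rational point into an integer triple. The triple is primitive only when x and y are coprime and of opposite parity, a condition this sketch does not enforce.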


In what follows, we will need the existence and basic properties of the 
trigonometric functions. It would be a long analytic digression, equivalent 
roughly to one semester of elementary analysis, to develop them thoroughly. 
Let us stabilize the situation by defining them and listing the properties we 
need using algebra. Note carefully the way in which $\pi$ sneaks into the act,
because later, when we prove that $\pi$ is irrational, the definition chosen here
will be important.

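We define sin x and cos x as two functions of a real variable x given by the
following formulas:

$$\begin{aligned} \sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots, \\ \cos x &= 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \frac{x^8}{8!} - \cdots. \end{aligned} \tag{2.1}$$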


The geometric motivation for this definition comes from the desire to find
functions satisfying the differential equation $y'' = -y$. Check that using these
definitions, if $y = \sin x$, then $y'' = -y$. Is the same true for $y = \cos x$?
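
The series definitions can be explored numerically. The sketch below sums the first few terms of each series in Python and compares the results with the standard library. The function names and the default of 20 terms are arbitrary choices made for illustration.

```python
import math

def sin_series(x, terms=20):
    """Partial sum of x - x^3/3! + x^5/5! - ... with the given number of terms."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2k+2)(2k+3)).
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))
    return total

def cos_series(x, terms=20):
    """Partial sum of 1 - x^2/2! + x^4/4! - ... with the given number of terms."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Next term: multiply by -x^2 / ((2k+1)(2k+2)).
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

sin_error = abs(sin_series(1.0) - math.sin(1.0))
cos_error = abs(cos_series(2.0) - math.cos(2.0))
```

For moderate x the partial sums settle very quickly, because the factorials in the denominators soon dominate the powers of x.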

You can show (try it) that sine and cosine defined in this way satisfy the 
following functional equations: 


$$\begin{aligned} \sin(x + y) &= \sin x\cos y + \sin y\cos x, \\ \cos(x + y) &= \cos x\cos y - \sin x\sin y. \end{aligned} \tag{2.2}$$




It follows (with some work) that

$$\sin^2 x + \cos^2 x = 1,$$

showing that these two functions are bounded in absolute value by 1. The
cosine function is positive at x = 0, since cos 0 = 1. If the cosine were
positive everywhere, then its second derivative, $-\cos x$, would be negative
everywhere. But that is incompatible with a bounded continuous function.
It follows that the cosine function must have zeros, and it must have them
at positive and negative values of x (why?). Let $\eta$ denote the first positive
number such that $\cos\eta = 0$. That is,

$$\cos x > 0 \text{ on } [0, \eta) \quad\text{and}\quad \cos\eta = 0.$$


From this, using equations (2.2), one derives the periodicity of the sine and
cosine:

$$\sin(x + 4\eta) = \sin x, \qquad \cos(x + 4\eta) = \cos x.$$

We see that as $t$ goes from 0 to $4\eta$, the complex number $\cos t + i\sin t$ traverses
the unit circle $S^1$ starting at (1, 0) in the Cartesian plane, in a counterclockwise
manner, exactly once.

Putting $x = \cos t$ and $y = \sin t$ and using the formula for arc length, we see
that the arc on $S^1$ from (1, 0) to $(\cos t, \sin t)$ in the counterclockwise sense
has length $t$ for $0 \le t \le 4\eta$, as illustrated in Figure 2.1.


Figure 2.1. The arc defined by (cos t, sin t).


The unit circle, of course, has length $2\pi$, so $2\pi = 4\eta$, whence $\eta = \pi/2$,
the period of the sine and cosine functions is $2\pi$, $\cos\left(\frac{\pi}{2} - \pi n\right) = 0$ for all
integers n, and everyone is happy.


One of the most important results in mathematics is the famous theorem of 
Abraham de Moivre (1667-1754), who is now well recognized as an unrec- 
ognized genius. He knew Isaac Newton and Alexander Pope. The result is 
simply this. 


Theorem 2.3 (De Moivre). For every integer n,

$$(\cos x + i\sin x)^n = \cos nx + i\sin nx.$$


The work is yours in 
Exercise 2.9 
Proof. First, we can assume n to be positive. (Why?) The base case: if n =
1, we are through. (Why?) The inductive step is just the addition formulas:
Assume that the theorem is true for $n - 1$. Then

$$(\cos x + i\sin x)^n = (\cos x + i\sin x)(\cos x + i\sin x)^{n-1},$$

which by the induction hypothesis equals

$$\begin{aligned} (\cos x + i\sin x)&(\cos(n-1)x + i\sin(n-1)x) \\ &= \cos x\cos(n-1)x - \sin x\sin(n-1)x \\ &\quad + i(\sin x\cos(n-1)x + \cos x\sin(n-1)x) \\ &= \cos nx + i\sin nx. \end{aligned}$$

So there. ∎
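
De Moivre's theorem is easy to test numerically. The following Python sketch (the function name is a hypothetical choice) measures the gap between the two sides for a given angle and integer exponent, including negative exponents.

```python
import math

def de_moivre_gap(x, n):
    """Absolute difference between (cos x + i sin x)^n and cos nx + i sin nx."""
    lhs = complex(math.cos(x), math.sin(x)) ** n
    rhs = complex(math.cos(n * x), math.sin(n * x))
    return abs(lhs - rhs)

# Largest gap over a few sample angles and exponents from -6 to 6.
worst_gap = max(
    de_moivre_gap(x, n)
    for x in (0.3, 1.0, 2.5)
    for n in range(-6, 7)
)
```

The gap is not exactly zero because of floating-point rounding, but it stays at the level of machine precision for these small exponents.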


This result is really useful, for it enables us to solve lots of important equations.
The most important equation for us is

$$x^n = 1. \tag{2.3}$$


It is not a priori clear at all that this has any roots besides x = 1. However,
consider the complex number

$$\zeta_n = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$$

Then

$$\zeta_n^n = \left(\cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}\right)^n = \cos 2\pi + i\sin 2\pi = 1. \tag{2.4}$$

Hence $\zeta_n$ is a root of (2.3). You can also see this by noting, as you will see
in Lookout Point 2.4, that to multiply two complex numbers, just add their
angles and multiply their absolute values.

It follows that

$$1,\ \zeta_n,\ \zeta_n^2,\ \ldots,\ \zeta_n^{n-1}$$

are the n distinct roots of (2.3), and they give the vertices of a regular polygon
with n sides, as shown in Figure 2.2.


Figure 2.2. The powers of $\zeta_n$.


This is a remarkable fact. Let us restate it as follows. 




Theorem 2.4. The solutions to the equation $x^n = 1$ form a multiplicative
cyclic group of order n.
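
A numerical illustration of Theorem 2.4, written as a Python sketch with hypothetical function names: it builds the powers of the first vertex and checks that each is a root of $x^n = 1$ and that multiplying two of them corresponds to adding exponents modulo n.

```python
import math

def roots_of_unity(n):
    """Powers 1, zeta, ..., zeta^(n-1) of zeta = cos(2 pi/n) + i sin(2 pi/n)."""
    zeta = complex(math.cos(2 * math.pi / n), math.sin(2 * math.pi / n))
    return [zeta ** k for k in range(n)]

def behaves_like_cyclic_group(n, tol=1e-9):
    roots = roots_of_unity(n)
    # Every listed number satisfies x^n = 1 (up to rounding).
    each_is_root = all(abs(r ** n - 1) < tol for r in roots)
    # Products stay in the list: zeta^j * zeta^k = zeta^((j + k) mod n).
    closed = all(
        abs(roots[j] * roots[k] - roots[(j + k) % n]) < tol
        for j in range(n)
        for k in range(n)
    )
    return each_is_root and closed

checks = [behaves_like_cyclic_group(n) for n in range(1, 13)]
```

This is a floating-point check with a tolerance, so it illustrates the theorem rather than proving it.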


Lookout Point 2.3. Here is an interesting fact. Are there any other finite
multiplicative subgroups of the complex numbers? Suppose G is a subgroup
of $C^* = C - \{0\}$ with n elements. Since G is a group, we see that $x^n = 1$ for
each $x \in G$, by elementary group theory. In this situation, trigonometry plays
the role of establishing an existence theorem. Later, we shall see that if a field
K has a finite multiplicative subgroup G, then that group G must necessarily
be cyclic; that is, there is an element $\rho \in K$ such that $G = \{1, \rho, \rho^2, \ldots, \rho^{n-1}\}$.
Hence G is an n-sided regular polygon. This observation plays an important
role in establishing arithmetic analogues of some ruler-and-compass constructions
in plane geometry (stay tuned).


Now, since $1, \zeta_n, \zeta_n^2, \ldots, \zeta_n^{n-1}$ are n roots of $x^n - 1$, and since $x^n - 1$ can
have at most n roots, we have by elementary algebra (writing $\zeta$ for $\zeta_n$),

$$x^n - 1 = (x - 1)(x - \zeta)(x - \zeta^2)\cdots(x - \zeta^{n-1}).$$


However, it is straightforward to see (in several different ways) that

$$\frac{x^n - 1}{x - 1} = 1 + x + x^2 + \cdots + x^{n-1}, \tag{2.5}$$

so that

$$1 + x + x^2 + \cdots + x^{n-1} = (x - \zeta)(x - \zeta^2)\cdots(x - \zeta^{n-1}).$$


This is a very nice formula, and we will have an opportunity to refer to it 
again. 
Putting x = 1, we have

$$n = (1 - \zeta)(1 - \zeta^2)\cdots(1 - \zeta^{n-1}),$$

which decomposes n into the product of $n - 1$ complex numbers.
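
The factorization of n can be confirmed numerically. Here is a minimal Python sketch; the name `unit_product` is an assumption made for this example.

```python
import math

def unit_product(n):
    """Product (1 - zeta)(1 - zeta^2)...(1 - zeta^(n-1)) for zeta = zeta_n."""
    zeta = complex(math.cos(2 * math.pi / n), math.sin(2 * math.pi / n))
    result = complex(1.0, 0.0)
    for k in range(1, n):
        result *= 1 - zeta ** k
    return result

# Distance between the product and n, for n = 2, ..., 12.
errors = [abs(unit_product(n) - n) for n in range(2, 13)]
```

Each factor is a genuinely complex number, yet the imaginary parts cancel in the product, leaving the integer n up to rounding error.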
Now

$$\zeta = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n},$$

so

$$\zeta^{-1} = \cos\frac{2\pi}{n} - i\sin\frac{2\pi}{n}.$$

Hence

$$\zeta + \zeta^{-1} = 2\cos\frac{2\pi}{n}. \tag{2.6}$$

This will be useful when we look at constructibility of regular polygons in
Section 2.2.
These terms are all defined
in Section 2.5. 


See Sections 2.5 and 2.6 
for more detail. And 
Joseph Rotman’s An 
Introduction to the Theory 
of Groups [70] is a good 
reference for group theory. 


What are the finite 
multiplicative subgroups 
of the field of real 
numbers? 


This formula implies the 
formula for the sum of a 
geometric series. This is 
Exercise 2.10. 
One more thing: notice that

$$\zeta_2 = \cos\frac{2\pi}{2} + i\sin\frac{2\pi}{2} = -1$$

and

$$\zeta_4 = \cos\frac{2\pi}{4} + i\sin\frac{2\pi}{4} = i.$$


Lookout Point 2.4. When we derived equation (2.4) in Section 2.1, we
noted that to multiply two complex numbers, you just add their angles and
multiply their absolute values. The typical way to develop the “add their
angles” piece of this is to wait for the addition formulas for sine and cosine
(equations (2.2) in Section 2.1), which is why many texts punt when they
get to “multiply the lengths and add the angles” in classes before trig, usually
appealing to experiments or other kinds of motivation. But teachers at the
Park City Teacher Leadership Program [82] (re)discovered a proof that uses
nothing more than similar triangles. A detailed development of this proof can
be found in [19, Chapter 3].

This may seem like small potatoes to many (“who cares what comes before
what?”), but it has curricular implications:


• Students can understand the addition formulas before any advanced trig.


• More importantly, one can use the geometry of multiplication to prove the
addition formulas. This saves a great deal of class time and simplifies the
whole arc of results.


And it allows one to use complex numbers to derive trig identities (some- 
thing that used to be considered circular reasoning). 
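For example, to get a formula for $\cos\left(\frac{\pi}{4} + \theta\right)$, calculate like this:

$$\begin{aligned} \cos\left(\tfrac{\pi}{4} + \theta\right) + i\sin\left(\tfrac{\pi}{4} + \theta\right) &= \left(\cos\tfrac{\pi}{4} + i\sin\tfrac{\pi}{4}\right)(\cos\theta + i\sin\theta) \\ &= \frac{1}{\sqrt{2}}(1 + i)(\cos\theta + i\sin\theta) \\ &= \frac{1}{\sqrt{2}}\bigl((\cos\theta - \sin\theta) + i(\cos\theta + \sin\theta)\bigr). \end{aligned}$$

Hence

$$\cos\left(\frac{\pi}{4} + \theta\right) = \frac{1}{\sqrt{2}}(\cos\theta - \sin\theta),$$

and as a bonus,

$$\sin\left(\frac{\pi}{4} + \theta\right) = \frac{1}{\sqrt{2}}(\cos\theta + \sin\theta).$$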


Exercises 


2.1 Prove Theorem 2.1. 
2.2 Show that the map $z \mapsto \bar z$ satisfies two properties:

(i) it is one to one: if $\bar z = \bar w$, then $z = w$;

(ii) it is onto: if $z \in C$, then $z = \bar w$ for some $w \in C$.


2.3 Show that the norm function is multiplicative. 


2.4 Take It Further. If f is a polynomial with complex coefficients, define
$\bar f$ to be the polynomial you get by replacing each coefficient in f by its
conjugate. Show that

(i) $\overline{f + g} = \bar f + \bar g$.
(ii) $\overline{fg} = \bar f\,\bar g$.
(iii) $\bar f = f$ if and only if f(x) has real coefficients.
(iv) $f\bar f$ has real coefficients.
(v) If $z \in C$, then $\overline{f(z)} = \bar f(\bar z)$.

2.5 Let f be a polynomial with coefficients in C. If a complex number z is a
root of f, show that $\bar z$ is a root of $\bar f$.


2.6 Suppose f is a polynomial with coefficients in C. Let $g = f\bar f$. Show that
if g(z) = 0, then either f(z) = 0 or $f(\bar z) = 0$.


2.7 Show that $S^1$ has the structure of a multiplicative group.


2.8 Let $\ell$ be a line in the plane that passes through $-1 + 0i$ with rational
slope t.


(i) If $\ell$ intersects $S^1$ in another point, show that this point’s coordinates
are rational.


(ii) In fact, find an expression in terms of t for the second intersection
point.


2.9 Using our definitions of sine and cosine, show that
$$\sin^2 x + \cos^2 x = 1.$$
2.10

(i) Establish the algebraic identity

$$\frac{x^n - 1}{x - 1} = 1 + x + x^2 + \cdots + x^{n-1}.$$

(ii) Use it to derive the formula for the sum of a geometric series.


2.11 Generalize the “very nice formula” (2.5) to show that if n is a nonnegative
integer and $\zeta = \zeta_n$, then the following hold:


(i) If x and y are integers, then

$$x^n - y^n = (x - y)(x - \zeta y)(x - \zeta^2 y)\cdots(x - \zeta^{n-1} y).$$


Recall that $S^1$ is the set
of complex numbers of
norm 1.


“Our definitions” are 
equations (2.1). 
“Ruler” here refers to a
straightedge—a ruler with 
no markings. 


This follows from de 
Moivre’s theorem. 


Two whole numbers a and
b are said to be “equal
modulo m” or “congruent
modulo m” if m divides
a − b, in which case a
and b are the same up to
(modulo) a multiple of m.
One writes, with Gauss,

$a \equiv b \bmod m$ or simply
$a \equiv b\ (m)$.
(ii) If x and y are integers and n is odd, then

$$x^n + y^n = (x + y)(x + \zeta y)(x + \zeta^2 y)\cdots(x + \zeta^{n-1} y).$$


2.12 Take It Further. Using the power series definitions of sine and cosine, 
prove all the statements made in this section. Prove other things too. 


2.2 The Pentagon, Gauss, and Kronecker 


The pentagon is the first really interesting polygon. We will see how the use 
of complex numbers leads to a proof that the pentagon can be constructed by 
ruler and compass. An examination of the proof also leads to some interesting 
arithmetic questions that will enable us to explore a little modular arithmetic. 

Consider the five-sided regular polygon inscribed in the unit circle in the 
complex plane, as illustrated in Figure 2.3: 


Figure 2.3. The regular unit pentagon.


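Here, $\zeta = \cos\frac{2\pi}{5} + i\sin\frac{2\pi}{5}$ and $\zeta^5 = 1$. It follows that the powers of $\zeta$
repeat with period 5: $\zeta^{k+5} = \zeta^k$ for every integer k.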

Already we see the modular arithmetic that we will further develop in
Section 2.3, because the exponents are always “congruent modulo 5.” Thus
$\zeta^{-1} = \zeta^4$ and $\zeta^{-2} = \zeta^3$. For example,


= dai - ae + aad - ae ry = ae 
Notice that

$$\zeta + \zeta^{-1} = 2\cos\frac{2\pi}{5}, \tag{2.7}$$

which is twice the real part of the first vertex, as previewed in equation (2.6)
in Section 2.1. So to get the value of the side length of our pentagon, we must
first find a nice expression for $\zeta + \zeta^{-1}$.

For that, we use the great idea of squaring $\zeta + \zeta^{-1}$, which gives

$$(\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2}.$$


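Now, $\zeta$ is a solution to

$$\frac{x^5 - 1}{x - 1} = 0,$$

but by Exercise 2.10, we have

$$\frac{x^5 - 1}{x - 1} = 1 + x + x^2 + x^3 + x^4, \tag{2.8}$$

so we have

$$1 + \zeta + \zeta^2 + \zeta^3 + \zeta^4 = 0. \tag{2.9}$$

Rewrite this as

$$1 + \zeta + \zeta^2 + \zeta^{-2} + \zeta^{-1} = 0,$$

or

$$\zeta^2 + 2 + \zeta^{-2} = 1 - (\zeta + \zeta^{-1}),$$

or

$$(\zeta + \zeta^{-1})^2 = 1 - (\zeta + \zeta^{-1}),$$

or

$$(\zeta + \zeta^{-1})^2 + (\zeta + \zeta^{-1}) - 1 = 0.$$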

Well, this is wonderful! We have a quadratic equation in $\zeta + \zeta^{-1}$.
Solving this equation gives

$$\zeta + \zeta^{-1} = \frac{-1 + \sqrt{5}}{2},$$

and there we are: we have an expression for $\cos\frac{2\pi}{5}$ that involves only rational
numbers and $\sqrt{5}$.
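
A short numerical check of this value, as a Python sketch with illustrative variable names: by equation (2.7) the quantity $\zeta + \zeta^{-1}$ equals $2\cos(2\pi/5)$, so it should agree with the positive root of the quadratic.

```python
import math

u = 2 * math.cos(2 * math.pi / 5)       # zeta + zeta^{-1}, by equation (2.7)
closed_form = (-1 + math.sqrt(5)) / 2   # positive root of u^2 + u - 1 = 0
residual = u * u + u - 1                # should vanish up to rounding
cos_72_degrees = closed_form / 2        # cos(2 pi / 5)
```

The other root of the quadratic, $(-1 - \sqrt{5})/2$, is negative and corresponds to $\zeta^2 + \zeta^{-2} = 2\cos(4\pi/5)$.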

We can play with this. Here we have also found the golden mean. Consider
the segment of length 1, divided into two parts. Find x such that the length
of the larger segment is the geometric mean of the length of the whole (that
is, 1) and the length of the smaller segment, that is, such that

$$\frac{1 - x}{x} = \frac{x}{1}.$$

See Figure 2.4.


Figure 2.4. Find the point x such that $\frac{1-x}{x} = \frac{x}{1}$.


This becomes

$$1 - x = x^2,$$

or

$$x^2 + x - 1 = 0,$$
To see this, use the
fact that $(\zeta + \zeta^{-1})^2 =$
$\zeta^2 + 2 + \zeta^{-2}$.


The rest of the story is 
taken up in the exercises. 
As Van Morrison said so
well [56], “Too late to stop 
now.” 


Oo =f, and... 
so that

$$x = \frac{-1 + \sqrt{5}}{2}$$

(our old friend).


And there’s more: from

$$x^2 + x - 1 = 0,$$

we see that

$$x(1 + x) = 1,$$

or

$$x = \frac{1}{1 + x} = \cfrac{1}{1 + \cfrac{1}{1 + x}} = \cdots = \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}},$$

whose partial sums are the ratios of consecutive Fibonacci numbers! So it’s a
small world.
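
The Fibonacci ratios can be generated directly from the relation x = 1/(1 + x). The Python sketch below iterates that map in exact arithmetic; the starting value 1 and the function name are assumptions chosen for illustration.

```python
from fractions import Fraction
import math

def truncations(count):
    """Iterate x -> 1/(1 + x) from x = 1, returning the successive values."""
    x = Fraction(1)
    values = []
    for _ in range(count):
        x = 1 / (1 + x)
        values.append(x)
    return values

first_five = truncations(5)
golden_mean = (-1 + math.sqrt(5)) / 2
gap = abs(float(truncations(30)[-1]) - golden_mean)
```

Each value has consecutive Fibonacci numbers as numerator and denominator, and the values close in on the golden mean from alternating sides.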

There are a few other observations to make about the above argument. We
have seen that

$$\zeta + \zeta^{-1} = \frac{-1 + \sqrt{5}}{2},$$

or

$$\sqrt{5} = 1 + 2(\zeta + \zeta^{-1}) = 1 + \zeta + \zeta^4 + \zeta^9 + \zeta^{16}.$$

In other words, $\sqrt{5}$ is the sum of square powers of a fifth root of unity. This
remarkable fact was generalized by Gauss. He took an arbitrary prime number
p and considered $\zeta_p$, the first vertex of a p-sided polygon. Then he showed
that if $4 \mid (p - 1)$, then

$$\sqrt{p} = 1 + \zeta_p + \zeta_p^4 + \zeta_p^9 + \cdots + \zeta_p^{(p-1)^2}. \tag{2.10}$$


This is not an elementary fact, and it took Gauss at least a year of work to 
prove it. We shall prove it in Chapter 6. 
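
Although the proof is deferred, the statement is easy to test numerically. Here is a Python sketch (with a hypothetical function name) that sums the square powers of $\zeta_p$ and compares the result with $\sqrt{p}$ for a few primes p with 4 dividing p − 1.

```python
import math

def gauss_sum(p):
    """Sum of zeta_p^(k^2) for k = 0, 1, ..., p - 1."""
    zeta = complex(math.cos(2 * math.pi / p), math.sin(2 * math.pi / p))
    # Exponents are reduced modulo p, since zeta^p = 1.
    return sum(zeta ** ((k * k) % p) for k in range(p))

primes_1_mod_4 = (5, 13, 17, 29)
gaps = [abs(gauss_sum(p) - math.sqrt(p)) for p in primes_1_mod_4]
```

Agreement up to rounding for a handful of primes is evidence only, not a substitute for the proof.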

The right-hand side is called a Gauss sum. It lies at the base of his proof 
that a regular polygon with a prime number p of sides can be constructed if p 
is of the form $2^t + 1$ for some t. Such primes are called Fermat primes. This
result is the very first entry of Gauss’s diary for March 30, 1796, and he was 
quite proud of it, for it represented the first progress in the constructibility 
of regular polygons since classical antiquity. Although Gauss stated as well 
that for a regular polygon with a prime number of sides to be constructible, 
the prime had to be a Fermat prime, he did not provide a proof. A proof was 
given in 1837 by Pierre Wantzel (1814—1848). The next two primes to which 
his result applies are 17 and 257. 




2.2.1. A Theorem of Kronecker 


Another observation on the above results has to do with a deep result due to
Kronecker. Since $\sqrt{5} = 1 + 2(\zeta_5 + \zeta_5^{-1})$, we see that the field $Q[\sqrt{5}]$ of all
expressions $a + b\sqrt{5}$, where a and b are rational numbers, lies in the field
$Q[\zeta_5]$ of all expressions $a_0 + a_1\zeta_5 + a_2\zeta_5^2 + a_3\zeta_5^3$, where $a_0, a_1, a_2, a_3$ are rational.
More generally, it is true that $Q[\sqrt{d}]$ lies in $Q[\zeta_n]$ for some n (4d
will do it). The fields $Q[\sqrt{d}]$ are special cases of what are known as abelian
extensions of the rationals. The word “abelian” here refers to the fact that the
set of automorphisms of $Q[\sqrt{d}]$ forms an abelian group under the composition
of mappings. Indeed, the only automorphisms of $Q[\sqrt{d}]$ are the identity
and the mapping that sends $a + b\sqrt{d}$ to $a - b\sqrt{d}$, and these two operations
form a group of order 2, which is, of course, abelian. You should verify this.

Even more generally, let F be any subfield of the complex numbers that 
has a finite basis as a vector space over Q. Every element $\alpha$ of F must satisfy
a polynomial equation with coefficients in Q, because


$$1, \alpha, \alpha^2, \alpha^3, \ldots, \alpha^n$$

must be Q-linearly dependent for some n. If every isomorphism of F into C 
sends F back to F (as in the above examples), then we say that F is a Galois 
extension of Q or simply that F is Galois. 

Now given a Galois extension F, one can attach to it a very important finite
group G called its Galois group. This group reflects in its structure much of 
the interaction between F and Q and is the object of many interesting inves- 
tigations in algebra. It is quite simple to define: it is the set of all automor- 
phisms of F, and the group operation is composition of functions. That is, 
an element of G is a mapping from F to F that is onto, one-to-one, and
preserves all the algebraic operations. If such a mapping is denoted by $\sigma$, then
the structure-preserving requirement means that


$$\sigma(a + b) = \sigma(a) + \sigma(b)$$
and
$$\sigma(ab) = \sigma(a)\sigma(b).$$


We say that the field F is abelian if the Galois group of F is abelian (that is, 
commutative). Now at last we can state Kronecker’s big result. Here it is: 


Every abelian extension field F of Q sits inside $\mathbb{Q}[\zeta_n]$ for some n.


This result is quite deep, and its generalizations form the object of much 
research. 
Another example: Consider $\zeta_8$, the first vertex of the regular octagon situated
in the complex plane. Then $\zeta_8^4 = -1$, and so (putting $\zeta_8 = \zeta$) we have
$\zeta^2 + \zeta^{-2} = \zeta^{-2}(\zeta^4 + 1) = 0$ and therefore


$$(\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2} = 2.$$


Q denotes the field of
rational numbers. Think 
“Q for quotient.” 


The fields $\mathbb{Q}[\sqrt{d}]$ and
$\mathbb{Q}(\zeta_5)$? Stay tuned.


Such fields are called 
algebraic number fields. 


For example, the Galois group of $\mathbb{Q}[\sqrt{d}]$ consists
of the two automorphisms described above: the
identity map and the map that sends $a + b\sqrt{d}$ to
$a - b\sqrt{d}$.


Kronecker’s theorem 
is also known as the 
Kronecker—Weber 
theorem. Note that n 
depends on F.


Another simple example:
$i = \zeta_4$, so trivially,
$\mathbb{Q}[i] \subset \mathbb{Q}[\zeta_4]$, because
$\mathbb{Q}[i] = \mathbb{Q}[\zeta_4]$.


Unnoticed, had it not been 
for the sidenote on that 
page. 


Z (zahlen) denotes the ring 
of ordinary integers. 


And there will be nothing 
special about 5 in the 
development. 


Are there any polyno- 
mials in Q[x] that have 
reciprocals in Q[ x]? 


This arithmetic is very 

similar to ordinary arith- 
metic with integers (this 
similarity is described in 
detail in [19, Chapter 6]). 


The motto is “formal 
polynomial identities 
are true under any 
substitution.”


Hence $\zeta + \zeta^{-1} = \sqrt{2}$, and therefore, $\mathbb{Q}[\sqrt{2}] \subset \mathbb{Q}[\zeta_8]$. Again we have a special
case of Kronecker–Weber.


Notice that in our examples, all the fields $\mathbb{Q}[\sqrt{-1}]$, $\mathbb{Q}[\sqrt{2}]$, $\mathbb{Q}[\sqrt{5}]$ are
two-dimensional over Q. Later, we will encounter a field F of dimension 3
over Q, and again, as a byproduct of a deeper investigation on roots of unity,
we will have another instance of Kronecker’s theorem.


2.2.2 Some Properties of Algebraic Extensions 
of Fields 


In Section 2.2.1, we slipped in a comment that may have gone unnoticed:
we referred to $\mathbb{Q}[\zeta_5]$, the set of rational linear combinations of powers of $\zeta_5$, as
a field. Recall that this means that in $\mathbb{Q}[\zeta_5]$, addition and multiplication are
commutative and that all the usual rules of high-school algebra hold. It also
means that the reciprocal of every nonzero element of $\mathbb{Q}[\zeta_5]$ sits in $\mathbb{Q}[\zeta_5]$.
Other familiar fields are Q, R, and C. But Z is not a field, because, for example,
$\frac{1}{2}$ is not in Z.

In this section, we will prove that $\mathbb{Q}[\zeta_5]$ is a field, and along the way, we
will show that $\mathbb{Q}[\zeta_5]$ is a vector space over Q with basis $\{1, \zeta_5, \zeta_5^2, \zeta_5^3\}$. In
the following paragraphs, we shall develop a few facts from field theory that
cover these statements.


A Little Field Theory 


We have been a little relaxed until now about distinctions that we should 
make explicit. If F is a field, we let F[x] denote the system of polynomials in 
x with coefficients in F together with the usual operations of high school— 
addition and multiplication. But F[x] is not a field, because the reciprocal of 
a polynomial is not, in general, another polynomial. In fact, F[x] is a ring, 
the ring of polynomials with coefficients in F.

Polynomials are formal objects, and as such, you can do arithmetic with 
them. You can also evaluate polynomials at real or complex numbers, so 
that each formal polynomial defines a polynomial function. This interplay 
between formal polynomials and polynomial functions—form and function— 
is a cornerstone of modern algebra. For example, you can factor $x^5 - 1$ using
equation (2.8):


$$x^5 - 1 = (x - 1)(x^4 + x^3 + x^2 + x + 1).$$


This is a formal identity in $\mathbb{Q}[x]$. It is a statement about polynomials, not
numbers, and you could prove it by, for example, multiplying out the right-hand
side and watching all but two terms disappear. Because this is a formal
identity, you can replace x by any number, and you will get a true statement
about numbers (why?). So for example, on replacing x by 2, we get


$$31 = 16 + 8 + 4 + 2 + 1.$$


And putting $x = \zeta_5$, we get

$$0 = (\zeta_5 - 1)(\zeta_5^4 + \zeta_5^3 + \zeta_5^2 + \zeta_5 + 1),$$


providing another look at equation (2.9) from Section 2.2. 


Lookout Point 2.5. Suppose again that F is a field. If you take all the
polynomials in F[x] and replace x by some number, say $\zeta_5$, you get a new
system (this time consisting exclusively of numbers), which we can suggestively
denote by $F[\zeta]$. So elements of $F[\zeta]$ are “polynomials in $\zeta$.” Note,
though, that two polynomials in F[x] can produce the same number in $F[\zeta]$.
For example, $x^5 - x + 1$ and $2 - x$ both produce $2 - \zeta$ when x is replaced by $\zeta$
(check this). While F[x] does not contain the reciprocals of all of its nonzero
elements, we will prove next that $F[\zeta]$ does! It is not at all obvious that the
reciprocal of a linear combination of powers of $\zeta$ with coefficients in F is also
a linear combination of powers of $\zeta$. But it is true. For example, in $\mathbb{Q}[\zeta]$, we
have


$$\frac{1}{\zeta^3 - \zeta^2 + 2\zeta} = -\frac{1}{11}\left(7\zeta^3 + 9\zeta^2 + 8\zeta + 3\right).$$

Checking that


$$(\zeta^3 - \zeta^2 + 2\zeta)(7\zeta^3 + 9\zeta^2 + 8\zeta + 3) = -11$$


makes for a delightful calculation. Try it! (This calculation didn’t drop out
of the sky. You will see later that there is a general method that allows one to
calculate reciprocals in $\mathbb{Q}[\zeta_n]$ using little more than high-school algebra and
some arithmetic. We will take this up in the coming chapters.)
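
For readers who want to confirm the "delightful calculation" by machine after doing it by hand, here is a minimal sketch. It assumes polynomials are stored as coefficient lists with the lowest degree first, and it uses only the relations $\zeta^5 = 1$ and $1 + \zeta + \zeta^2 + \zeta^3 + \zeta^4 = 0$ for $\zeta = \zeta_5$. The helper names `poly_mul` and `reduce_zeta5` are ours.

```python
from fractions import Fraction

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += Fraction(x) * Fraction(y)
    return out

def reduce_zeta5(c):
    """Rewrite a polynomial in zeta_5 on the basis 1, zeta, zeta^2, zeta^3."""
    folded = [Fraction(0)] * 5
    for k, x in enumerate(c):
        folded[k % 5] += x            # zeta^5 = 1
    # zeta^4 = -(1 + zeta + zeta^2 + zeta^3)
    return [folded[i] - folded[4] for i in range(4)]

g = [0, 2, -1, 1]        # 2*zeta - zeta^2 + zeta^3
t = [3, 8, 9, 7]         # 3 + 8*zeta + 9*zeta^2 + 7*zeta^3
print(reduce_zeta5(poly_mul(g, t)))          # coordinates of the product
print(reduce_zeta5([1, -1, 0, 0, 0, 1]))     # x^5 - x + 1 evaluated at zeta
```

The second call illustrates the remark that $x^5 - x + 1$ and $2 - x$ give the same element of $\mathbb{Q}[\zeta]$.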


Meanwhile, back at the ranch ... If E and F are fields and $E \supset F$, we say
that E is an extension field of F, and we make a “tower diagram” like this:


E 


F 


An element $\alpha \in E$ is said to be algebraic over F if there exists a nonzero
polynomial f in F[x] for which $f(\alpha) = 0$. For example, i, viewed as an
element of C or Q(i), is algebraic over Q, since it satisfies the equation $x^2 + 1 = 0$.
Similarly, $\zeta_n$ is algebraic over Q, since $\zeta_n^n - 1 = 0$. If $\alpha \in E$ is algebraic
over F, we can construct $F[\alpha]$ in the same way that we built $F[\zeta]$ in the
Lookout Point above: $F[\alpha]$ is the set of linear combinations of powers of $\alpha$
with coefficients in F.

It is natural to seek, among all polynomials f in F[x] that admit $\alpha$ as a
root, polynomials of lowest degree. The next result shows that such a polynomial
is essentially unique and that it is irreducible.


Theorem 2.5. Let $\alpha$ be algebraic over a field F, and let f(x) be a polynomial
in F[x] of minimal degree with $f(\alpha) = 0$, normalized so that its leading
coefficient is 1. Then:

(i) f(x) is the only monic polynomial with this property.
(ii) f(x) is irreducible.


Let’s use just plain $\zeta$ for $\zeta_5$.


If you don’t get “Meanwhile, back at the ranch...,” check out the Wikipedia
article on the subject.


In Chapter 5, we will prove that $\pi$ is not algebraic over Q.


A polynomial is irreducible in F[x] if it doesn’t factor into polynomials of lower
positive degree. And a polynomial with leading coefficient 1 is a monic polynomial.


This is the division algorithm for polynomials. See [41] or [19].


For more on the Euclidean algorithm in F[x] (and in Z), see Chapter 3,
[19, Chapter 6], or [41, Chapter 1].


Proof. (i) Suppose there were two monic polynomials of smallest degree that
send $\alpha$ to 0. Their difference would also vanish at $\alpha$, and, being nonzero, it would
have lower degree. This contradicts the minimality of degree of the original
polynomial.

(ii) If f(x) = g(x)h(x), $f(\alpha) = 0$, and both g(x) and h(x) are of positive
degree, then we must have either $g(\alpha) = 0$ or $h(\alpha) = 0$. But the (alleged) g
and h would then have degree less than that of f, contradicting minimality. □


Corollary 2.6. Let f(x) be the monic polynomial in F[x] of minimal degree
with $f(\alpha) = 0$. If $g(x) \in F[x]$ and $g(\alpha) = 0$, then f(x) divides g(x).


Proof. Write


$$g(x) = f(x)h(x) + r(x) \quad\text{with } r(x) = 0 \text{ or } \deg r(x) < \deg f(x).$$


Then


$$0 = g(\alpha) = f(\alpha)h(\alpha) + r(\alpha) = r(\alpha).$$
By the minimality of the degree, we must have r(x) = 0. □


It follows that the monic polynomial $f \in F[x]$ of minimal degree satisfying
$f(\alpha) = 0$ is uniquely determined by $\alpha$ and F. It is called the minimal
polynomial for $\alpha$. Theorem 2.5 shows that the minimal polynomial is nothing
more than the unique monic irreducible polynomial in F[x] that has $\alpha$ as
a root.


2.2.3 Now We Can Show That $\mathbb{Q}[\zeta_n]$ Is a Field


Up until now, $\mathbb{Q}[\zeta_n]$ has meant the set of rational linear combinations of powers of $\zeta_n$.
We will now show that $\mathbb{Q}[\zeta_n]$ contains the reciprocals of all its nonzero
elements.


Theorem 2.7. Let $\alpha$ be algebraic over a field F, and let f be the minimal
polynomial for $\alpha$ in F[x].


(i) If g(x) is a nonzero polynomial in F[x] and $g(\alpha) \neq 0$, then $\frac{1}{g(\alpha)}$ is
in $F[\alpha]$.
(ii) If f is the minimal polynomial for $\alpha$, then $F[\alpha]$ is a vector space over F
of dimension equal to the degree n of f with basis $\{1, \alpha, \alpha^2, \ldots, \alpha^{n-1}\}$.


Proof. (i) Note that f(x) does not divide g(x) (for $g(\alpha) \neq 0$). Since f(x) is
irreducible, it follows that f(x) and g(x) are relatively prime. Now, just as
in the case of Z, one can use the Euclidean algorithm in F[x] to show that the
greatest common divisor of two polynomials is an F[x]-linear combination of
the two polynomials. In other words, one can find s(x) and t(x) in F[x] such
that s(x)f(x) + t(x)g(x) = 1. On substituting $x = \alpha$, we see that $t(\alpha)g(\alpha) = 1$.
So $1/g(\alpha) = t(\alpha)$.

(ii) Write


$$f(x) = x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_1 x + a_0.$$
Since $f(\alpha) = 0$, we have
$$\alpha^n = -a_{n-1}\alpha^{n-1} - a_{n-2}\alpha^{n-2} - \cdots - a_1\alpha - a_0.$$
Thus every linear combination of powers of $\alpha$ may be written (by repeated
substitution) in the form


$$c_0 + c_1\alpha + c_2\alpha^2 + \cdots + c_{n-1}\alpha^{n-1}.$$


It remains to show that $1, \alpha, \alpha^2, \ldots, \alpha^{n-1}$ are linearly independent over F.
Suppose to the contrary that


$$b_0 + b_1\alpha + b_2\alpha^2 + \cdots + b_{n-1}\alpha^{n-1} = 0,$$
with not all the $b_i$ zero. Then by Corollary 2.6, f(x) must divide the polynomial


$$b_0 + b_1 x + b_2 x^2 + \cdots + b_{n-1}x^{n-1}.$$


However, that is absurd, since f(x) has degree n and $b_0 + b_1 x + b_2 x^2 + \cdots + b_{n-1}x^{n-1}$
has degree less than n. Hence we have established all assertions. □
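
The proof of part (i) is constructive: running the Euclidean algorithm on f and g produces the polynomial t with $t(\alpha)g(\alpha) = 1$. The following is a minimal sketch of that computation over Q, not the method the later chapters will present. Polynomials are coefficient lists with the lowest degree first, and all helper names (`trim`, `add`, `mul`, `divmod_poly`, `inverse_mod`) are ours.

```python
from fractions import Fraction

def trim(p):
    """Convert to Fractions and drop trailing zero coefficients."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def add(a, b, sign=1):
    """Return a + sign*b."""
    n = max(len(a), len(b))
    a = list(a) + [Fraction(0)] * (n - len(a))
    b = list(b) + [Fraction(0)] * (n - len(b))
    return trim([x + sign * y for x, y in zip(a, b)])

def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)

def divmod_poly(a, b):
    """Division algorithm: (q, r) with a = q*b + r and r = 0 or deg r < deg b."""
    q, r = [], trim(a)
    while len(r) >= len(b):
        term = [Fraction(0)] * (len(r) - len(b)) + [r[-1] / b[-1]]
        q = add(q, term)
        r = add(r, mul(term, b), sign=-1)
    return q, r

def inverse_mod(g, f):
    """Return t with t*g = 1 modulo f, assuming f and g are relatively prime."""
    # Invariant: r0 = t0*g and r1 = t1*g modulo f.
    r0, t0 = trim(f), []
    r1, t1 = trim(g), [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, t0, r1, t1 = r1, t1, r, add(t0, mul(q, t1), sign=-1)
    assert len(r0) == 1, "f and g are not relatively prime"
    t = [c / r0[0] for c in t0]          # scale so that the gcd becomes 1
    return divmod_poly(t, trim(f))[1]

f = [1, 1, 1, 1, 1]                      # minimal polynomial of zeta_5
g = [0, 2, -1, 1]                        # zeta^3 - zeta^2 + 2*zeta
print(inverse_mod(g, f))
```

Applied to the element from Lookout Point 2.5, the coefficients printed are those of the reciprocal on the basis $1, \zeta, \zeta^2, \zeta^3$.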


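Corollary 2.8. Let $\alpha$ be algebraic over a field F. Then $F[\alpha]$ is a field. That
is,


$$F[\alpha] = \left\{ \frac{g(\alpha)}{h(\alpha)} \;:\; g, h \in F[x],\ h(\alpha) \neq 0 \right\}.$$


When a ring $F[\alpha]$ turns out to be a field, we usually indicate this by
employing the notation $F(\alpha)$.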


Corollary 2.9.
$\mathbb{Q}[\zeta_n]$ is a field; it contains the reciprocals of its nonzero elements. We may
therefore write $\mathbb{Q}[\zeta_n]$ as $\mathbb{Q}(\zeta_n)$, the field of linear combinations of powers of
$\zeta_n$.


2.2.4 A Criterion for Irreducibility 


This is just the beginning. One can go on and on, developing the entire theory 
of algebraic extensions. One of the early payoffs in such a program is Galois 
theory. However, let us limit ourselves to a clarification of the cyclotomic 
situation discussed earlier in Section 2.1. 

In general, it is difficult to establish the irreducibility of a given polynomial 
in Z[x]. One very useful criterion is due to Gotthold Eisenstein (1823-1852). 
It is based on a bit of modular arithmetic, which we will take up in more detail 
in Section 2.3. For now, here are some basic facts. 


$F[\alpha] = F(\alpha)$


“cyclotomic” = “circle 
dividing.” 

There is a story that 
may be true. On being 
asked who he believed 
were the three greatest 
mathematicians of all 
time, Gauss answered, 
“Archimedes, Newton, 
and Eisenstein.” The 
correct answer is, of 
course, Archimedes, 
Newton, and Gauss.


You may have met

these operations before. 
Sometimes, they are called 
the operations of “clock 
arithmetic.” Why? 


The shorthand for this is 
that Z7[x] is a unique 
factorization domain. 


7 is just a placeholder here 
for any prime.


Preview: A Little Modular Arithmetic


Before we develop modular arithmetic systematically, let us look at a special 
case. Consider the prime number 7 and denote by Z7 the set of seven sym- 
bols {0, 1, 2, 3,4, 5,6}. These symbols are already familiar to you, but we can 
introduce new operations on them, new addition and new multiplication, so 
that we remain in Z7. For that, we use the following rule: multiply or add
as in the old days, but throw away sevens until you get back to Z7. For
example, 5 × 6 = 2 (because 30 − 4 × 7 = 2), and 5 + 6 = 4. The complete
addition and multiplication tables are given in Figure 2.5. 


 + | 0 1 2 3 4 5 6        × | 0 1 2 3 4 5 6
---+--------------       ---+--------------
 0 | 0 1 2 3 4 5 6        0 | 0 0 0 0 0 0 0
 1 | 1 2 3 4 5 6 0        1 | 0 1 2 3 4 5 6
 2 | 2 3 4 5 6 0 1        2 | 0 2 4 6 1 3 5
 3 | 3 4 5 6 0 1 2        3 | 0 3 6 2 5 1 4
 4 | 4 5 6 0 1 2 3        4 | 0 4 1 5 2 6 3
 5 | 5 6 0 1 2 3 4        5 | 0 5 3 1 6 4 2
 6 | 6 0 1 2 3 4 5        6 | 0 6 5 4 3 2 1


Figure 2.5. Addition and multiplication tables for Z7.


An examination of the tables shows that Z7 is a field: in particular, every
nonzero element has a multiplicative inverse. There are other properties that
are not so evident. Take 2, for example, and begin raising it to various powers,
$1, 2, 2^2, 2^3, \ldots$. One obtains 1, 2, 4, 1, 2, 4, 1, 2, 4, ..., showing that { 1, 2, 4 } is a
(cyclic) subgroup of three elements. On the other hand, beginning with 3, we
have $\{1, 3, 3^2, 3^3, 3^4, 3^5\} = \{1, 3, 2, 6, 4, 5\}$, which is the whole set of nonzero
elements in Z7. Because the number 3 generates the entire multiplicative group
of nonzero elements of Z7, it is called a primitive element, or simply a primitive, modulo 7.
Are there any other primitives modulo 7?
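
The power sequences above are easy to generate by machine, which helps when experimenting with the question just asked. This is a small sketch; the function names `powers` and `is_primitive` are ours, and `powers` assumes that a is nonzero in $\mathbb{Z}_p$ with p prime (otherwise the sequence never returns to 1).

```python
def powers(a, p):
    """List 1, a, a^2, ... in Z_p until the sequence returns to 1."""
    out, x = [1], a % p
    while x != 1:
        out.append(x)
        x = (x * a) % p
    return out

def is_primitive(a, p):
    """True if the powers of a fill out all p - 1 nonzero elements of Z_p."""
    return len(powers(a, p)) == p - 1

print(powers(2, 7))   # the cycle generated by 2
print(powers(3, 7))   # the cycle generated by 3
print([a for a in range(1, 7) if is_primitive(a, 7)])   # all primitives modulo 7
```

Changing 7 to another prime gives more data for conjectures about how many primitives there are.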

Just as with any field, we can consider the set Z7[x] of formal polynomi- 
als in one variable with coefficients in Z7. And just as with polynomials in 
Q[x] or C[x], we can do arithmetic with elements of Z7[x]. This arithmetic 
supports addition and multiplication, and all the usual rules of algebra apply. 
In particular, it so happens that every polynomial in Z7[x] can be factored
into irreducible polynomials in essentially one way. It takes a little time to get
used to algebra in Z7[x], but you will get used to it with a little practice. For
example, $x^2 - 1 = (x - 1)(x + 1)$, as always. But $x^2 - 1 = (x + 6)(x + 1)$, too.
(Why is this the same factorization?) Another example: $x^2 - 2 = (x + 4)(x + 3)$.

One of the most useful properties of this setup is in the interaction between
Z7[x] and Z[x]. If f is any polynomial in Z[x], you can get a corresponding
polynomial $\bar{f}$ by “reducing the coefficients of f modulo 7.” This means
replacing each coefficient in f with the remainder when that coefficient is
divided by 7. So for example:


(i) $\overline{x^3 + 14x^2 + 8x + 9} = x^3 + x + 2$;
(ii) $\overline{x^3 - 12x^2 + 21x - 10} = x^3 + 2x^2 + 4$;
(iii) $\overline{x^3 - 49x^2 + 21x - 7} = x^3$;
(iv) $\overline{x^3 - 14x^2 + 28x - 7} = x^3$;
(v) $\overline{x^{100} - 14x^2 + 28x - 7} = x^{100}$;
(vi) $\overline{(x^{40} - 14x^2 + 28x - 7)} \cdot \overline{(x^{60} - 42x^2 + 28x - 14)} = x^{100}$.


If you work it out, you will see that


$$\overline{(x^{40} - 14x^2 + 28x - 7) \cdot (x^{60} - 42x^2 + 28x - 14)}$$


is also $x^{100}$. And it works with addition, too. Details are in the exercises.
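
Working out the product of a degree-40 and a degree-60 polynomial by hand is tedious, so here is a hedged sketch that does the bookkeeping. It assumes coefficient lists with the lowest degree first; the helper names `reduce_mod`, `poly_mul`, and `monomial_plus` are ours.

```python
def reduce_mod(coeffs, p):
    """Reduce integer coefficients modulo p and drop leading zero coefficients."""
    out = [c % p for c in coeffs]
    while out and out[-1] == 0:
        out.pop()
    return out

def poly_mul(a, b):
    """Multiply integer polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1) if a and b else []
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def monomial_plus(low, degree):
    """Coefficient list for x^degree + (low-order part)."""
    return list(low) + [0] * (degree - len(low)) + [1]

f = monomial_plus([-7, 28, -14], 40)    # x^40 - 14x^2 + 28x - 7
g = monomial_plus([-14, 28, -42], 60)   # x^60 - 42x^2 + 28x - 14

lhs = reduce_mod(poly_mul(f, g), 7)                                 # reduce the product
rhs = reduce_mod(poly_mul(reduce_mod(f, 7), reduce_mod(g, 7)), 7)   # multiply the reductions
print(lhs == rhs, len(lhs) - 1)          # do they agree, and what is the degree?

print(reduce_mod([9, 8, 14, 1], 7))      # example (i): x^3 + 14x^2 + 8x + 9
```

Both routes give the same polynomial in Z7[x], which is the compatibility the exercises ask you to prove in general.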

Meanwhile, back at the ranch ... We want a test for irreducibility. As
usual, we begin with an example. Consider the polynomial $x^4 + 2$ in Z[x].
How do you know that this polynomial is irreducible? You should check that
there is no linear factor. To eliminate the possibility of quadratic factors, however,
requires a bit of calculation, which, although far from insurmountable, is
doomed to limitation. Suppose, for example, we asked about the irreducibility
of $x^{100} + 2$. Eisenstein simply viewed the equation modulo 2. In $\mathbb{Z}_2[x]$,
the polynomial becomes $x^{100}$. If $x^{100} + 2$ were reducible, then one could write
$x^{100} + 2 = f(x)g(x)$ in Z[x], each factor having positive degree less than 100 and
monic. On reducing modulo 2, in $\mathbb{Z}_2[x]$ we would have


$$\overline{x^{100} + 2} = \bar{f}(x)\,\bar{g}(x),$$


or
$$x^{100} = \bar{f}(x)\,\bar{g}(x),$$
and since factorization is unique, we must have, in $\mathbb{Z}_2[x]$,

$$\bar{f}(x) = x^m \quad\text{and}\quad \bar{g}(x) = x^n, \qquad m, n < 100.$$


Lifting these equations back to Z[x], we see that f(x) must look like
$x^m + a_{m-1}x^{m-1} + a_{m-2}x^{m-2} + \cdots + a_0$, with 2 dividing all the $a_i$, and g(x)
must look like $x^n + b_{n-1}x^{n-1} + b_{n-2}x^{n-2} + \cdots + b_0$, with 2 dividing all the $b_i$.
Thus 4 must divide $a_0 b_0$, which is impossible, because $f(x)g(x) = x^{100} + 2$.


Gauss showed that if a polynomial is irreducible in Z[x], then it is also
irreducible in Q[x].

The argument is identical if we consider x! + 2x47 + 12x° + 2 or x10 + 
16x + 2, because we only picked on $a_0$ and $b_0$ and required that the polynomial
become $x^{100}$ in $\mathbb{Z}_2[x]$. You see how powerful the criterion is. It was crucial
that the constant term was 2 and not 2”. On the other hand, it could have been 
6 or 2s for s odd. These observations lead to the following theorem due to 
Eisenstein. The proof is just as simple as the example, and we leave it to you 
as an exercise. 
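
The pattern in these examples (a prime p dividing every coefficient below the leading one, while $p^2$ does not divide the constant term) is easy to test by machine. The sketch below only checks those hypotheses for a monic polynomial of degree at least 1; the function name `eisenstein_applies` is ours, and coefficient lists run from the constant term up.

```python
def eisenstein_applies(coeffs, p):
    """Check the divisibility pattern for a monic polynomial and a prime p."""
    *lower, leading = coeffs
    return (
        leading == 1
        and all(a % p == 0 for a in lower)      # p divides every lower coefficient
        and lower[0] % (p * p) != 0             # p^2 does not divide the constant term
    )

x100_plus_2 = [2] + [0] * 99 + [1]              # x^100 + 2
x100_16x_2 = [2, 16] + [0] * 98 + [1]           # x^100 + 16x + 2
x100_plus_4 = [4] + [0] * 99 + [1]              # x^100 + 4

print(eisenstein_applies(x100_plus_2, 2))
print(eisenstein_applies(x100_16x_2, 2))
print(eisenstein_applies(x100_plus_4, 2))       # constant term divisible by 4
```

A result of `False` means only that the argument above does not go through for that prime; by itself it says nothing about whether the polynomial factors.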


Theorem 2.10 (Eisenstein’s criterion). Suppose that
$f(x) = x^n + a_{n-1}x^{n-1} + a_{n-2}x^{n-2} + \cdots + a_0$ is a polynomial in Z[x] and that p is a prime number that
divides each of the $a_i$ but $p^2$ does not divide $a_0$. Then f is irreducible in Q[x].


Proof. This is Exercise 2.30. Have fun. □


You will show this in
Exercise 2.15. 


Take heart. We’ll prove 
both of these statements 
in the Supplement to 
Chapter 4.


Lookout Point 2.6. Here is a famous example of how Eisenstein’s criterion
is applied: when p is a prime, $f(x) = x^{p-1} + x^{p-2} + \cdots + 1$ is irreducible in
Q[x].

But there aren’t any primes dividing the coefficients! A clever substitution
comes to the rescue: It is enough to show that f(x + 1) is irreducible (as you
will show in Exercise 2.31). We have seen this before for p = 5, but it works
in general. Namely, $f(x) = \frac{x^p - 1}{x - 1}$. So


$$f(x+1) = \frac{(x+1)^p - 1}{x} = x^{p-1} + p\,x^{p-2} + \binom{p}{2}x^{p-3} + \cdots + p.$$


Each of the binomial coefficients is of the form


$$\binom{p}{j} = \frac{p(p-1)(p-2)\cdots(p-j+1)}{j!},$$


where j < p. Look at the fraction on the right-hand side: p is a factor of the
numerator, but p doesn’t divide the denominator. Hence p is a factor of $\binom{p}{j}$.


And $p^2$ is not a factor of the constant term (it is just p). So Eisenstein applies,
and f(x) is irreducible.
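
The divisibility claims in this argument can be checked for small primes with a few lines of code. This is only an illustration of the hypotheses, assuming the expansion $(x+1)^p - 1 = \sum_{j=1}^{p} \binom{p}{j} x^j$; the function name `shifted_cyclotomic` is ours.

```python
from math import comb

def shifted_cyclotomic(p):
    """Coefficients of f(x+1) = ((x+1)^p - 1)/x, lowest degree first."""
    # Dividing the sum of C(p, j) x^j (j = 1..p) by x shifts each index down by one.
    return [comb(p, j) for j in range(1, p + 1)]

for p in (3, 5, 7, 11, 13):
    c = shifted_cyclotomic(p)
    assert c[-1] == 1 and c[0] == p                  # monic, constant term p
    assert all(a % p == 0 for a in c[:-1])           # p divides the lower coefficients
    assert c[0] % (p * p) != 0                       # p^2 does not divide the constant term
    print(p, c)
```

Trying a composite value such as 9 in place of p shows where the argument breaks down: the divisibility assertions fail.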


A couple of facts follow from this: 


Theorem 2.11.
(i) The minimal polynomial for $\zeta_p$ in Q[x] is $1 + x + x^2 + \cdots + x^{p-1}$.


(ii) If p is a prime, then $\mathbb{Q}[\zeta_p]$ $(= \mathbb{Q}(\zeta_p))$ is the set of all linear combinations
$$a_0 + a_1\zeta_p + a_2\zeta_p^2 + \cdots + a_{p-2}\zeta_p^{p-2}$$
with coefficients $a_i$ in Q.


(iii) To close a loop that we opened earlier in this section, when p = 5, $\mathbb{Q}(\zeta_5)$
is a vector space over Q with basis $1, \zeta_5, \zeta_5^2, \zeta_5^3$.


What about a method for producing the minimal polynomial for $\zeta_n$ for any
positive integer n, a polynomial that we will denote by $\Psi_n(x)$? Instead of
considering all powers of $\zeta_n$, just consider the powers $\zeta_n^j$, where $1 \le j \le n$ and (j, n) = 1,
that is, where j and n are relatively prime. The number of such integers is
denoted by $\varphi(n)$. The function $\varphi$ (“Euler’s phi function”) shows up all over
mathematics and has some beautiful properties. One of them is that
$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right)$, where the product is over all primes dividing n. The minimal
polynomial for $\zeta_n$ turns out to be


$$\Psi_n(x) = \prod_{(j,n)=1} \left(x - \zeta_n^j\right).$$


It is not obvious that this polynomial is in Z[x], but it is. Even less obvious is
that it is irreducible in Z[x]. But it is.
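
Both $\varphi(n)$ and $\Psi_n(x)$ can be computed with integer arithmetic alone. The sketch below leans on one fact that the text has not stated and that we take as an assumption here: $x^n - 1$ is the product of $\Psi_d(x)$ over all divisors d of n, so $\Psi_n$ is what remains after dividing out $\Psi_d$ for the proper divisors. All function names are ours, and coefficient lists run from the constant term up.

```python
from math import gcd

def phi(n):
    """Euler's phi function by direct count of 1 <= j <= n with gcd(j, n) = 1."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)

def phi_by_product(n):
    """phi(n) = n * prod(1 - 1/p) over primes p dividing n, in integer arithmetic."""
    result, m, p = n, n, 2
    while p * p <= m:
        if m % p == 0:
            result -= result // p       # multiply result by (1 - 1/p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:                           # one prime factor may be left over
        result -= result // m
    return result

def poly_divexact(a, b):
    """Exact division of integer polynomials, with b monic."""
    a, q = list(a), [0] * (len(a) - len(b) + 1)
    for k in range(len(q) - 1, -1, -1):
        q[k] = a[k + len(b) - 1]
        for i, c in enumerate(b):
            a[k + i] -= q[k] * c
    assert not any(a), "division was not exact"
    return q

def cyclotomic(n):
    """Psi_n(x), assuming x^n - 1 is the product of Psi_d(x) over divisors d of n."""
    poly = [-1] + [0] * (n - 1) + [1]           # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = poly_divexact(poly, cyclotomic(d))
    return poly

for n in (5, 6, 8, 12):
    print(n, phi(n), cyclotomic(n))
```

In each case the degree of the computed polynomial equals $\varphi(n)$, as the product formula for $\Psi_n$ requires.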
Lookout Point 2.7. You should notice that $1 + x + x^2 + \cdots + x^n$ is not
irreducible for general n. It is irreducible when n = p − 1 for a prime p, as we
have just seen. But consider


$$1 + x + x^2 + x^3 + x^4 + x^5 = \frac{x^6 - 1}{x - 1}.$$


You can factor $x^6 - 1$ in two ways: as a difference of squares or a difference
of cubes:


$$x^6 - 1 = (x^3)^2 - 1 = (x^3 - 1)(x^3 + 1) = (x - 1)(x^2 + x + 1)(x + 1)(x^2 - x + 1),$$
$$x^6 - 1 = (x^2)^3 - 1 = (x^2 - 1)(x^4 + x^2 + 1) = (x - 1)(x + 1)(x^4 + x^2 + 1).$$


Either way, $1 + x + x^2 + x^3 + x^4 + x^5$ is not irreducible. And there’s more:
Comparing the two factorizations and invoking unique factorization, we must
have


$$x^4 + x^2 + 1 = (x^2 + x + 1)(x^2 - x + 1).$$


This could be computed directly by expanding the right-hand side and watching
things fall away. Or you can recognize that the left-hand side is a “difference
of squares in disguise”:


$$x^4 + x^2 + 1 = (x^4 + 2x^2 + 1) - x^2.$$


One moral of the story: It may seem natural to assume that $x^4 + x^2 + 1$ must
be irreducible because $h(x) = x^2 + x + 1$ is, and the quartic is just $h(x^2)$. But
the implication goes only one way: if p(x) is a reducible polynomial in Q[x],
then so is $p(x^n)$ for every positive integer n, as can be seen by a substitution.
But if p(x) is irreducible, then all bets are off regarding $p(x^n)$. It may or may
not factor.


Meanwhile, back at the ranch ... Because $\zeta_6$ satisfies


$$(x + 1)(x^2 + x + 1)(x^2 - x + 1) = 0,$$


we see that $[\mathbb{Q}[\zeta_6] : \mathbb{Q}] \le 5$. In fact, thanks to de Moivre, $\zeta_6 = \frac{1}{2} + \frac{\sqrt{3}}{2}i$, so
$\zeta_6$ and its conjugate are roots of $x^2 - x + 1 = 0$ (check this), and the minimal
polynomial for $\zeta_6$ over Q is thus $x^2 - x + 1$.


As a final example, let us consider the problem of constructing a regular
7-gon with ruler and compass. As we did with the pentagon, let $\zeta = \cos\frac{2\pi}{7} + i\sin\frac{2\pi}{7}$,
so that the regular unit heptagon inscribed in the unit circle in the
complex plane looks like Figure 2.6.


This is a beautiful example, one that can be mined to illustrate the power of
unique factorization in Q[x].


The dimension of $\mathbb{Q}[\zeta_6]$ as a vector space over Q is called the degree of the
extension. We denote this degree by $[\mathbb{Q}[\zeta_6] : \mathbb{Q}]$ and decorate the field tower
diagram like this:


$\mathbb{Q}(\zeta_6)$
  | 2
$\mathbb{Q}$


Hint: Do you recall the rational root theorem from high-school algebra?


Figure 2.6. The regular unit heptagon.


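Just as we did in equation (2.7), let us look for the numerical value of the
real part of $\zeta$; that is, look at $\zeta + \zeta^{-1} = 2\cos\frac{2\pi}{7}$. It’s the same drill: Start with
our favorite relation


$$\zeta^6 + \zeta^5 + \zeta^4 + \zeta^3 + \zeta^2 + \zeta + 1 = 0.$$
On dividing through by $\zeta^3$, we obtain
$$\zeta^3 + \zeta^2 + \zeta + 1 + \zeta^{-1} + \zeta^{-2} + \zeta^{-3} = 0,$$
or
$$(\zeta^3 + \zeta^{-3}) + (\zeta^2 + \zeta^{-2}) + (\zeta + \zeta^{-1}) + 1 = 0.$$


It begins to look suspiciously as though $\zeta + \zeta^{-1}$ satisfies a cubic polynomial.
In fact,


$$(\zeta + \zeta^{-1})^3 = \zeta^3 + \zeta^{-3} + 3(\zeta + \zeta^{-1})$$
and
$$(\zeta + \zeta^{-1})^2 = \zeta^2 + \zeta^{-2} + 2,$$


and solving these two equations for $\zeta^3 + \zeta^{-3}$ and $\zeta^2 + \zeta^{-2}$ respectively and
substituting in the above gives


$$(\zeta + \zeta^{-1})^3 - 3(\zeta + \zeta^{-1}) + (\zeta + \zeta^{-1})^2 - 2 + (\zeta + \zeta^{-1}) + 1 = 0,$$
or
$$(\zeta + \zeta^{-1})^3 + (\zeta + \zeta^{-1})^2 - 2(\zeta + \zeta^{-1}) - 1 = 0.$$


Hence $\zeta + \zeta^{-1}$ is a root of the cubic


$$x^3 + x^2 - 2x - 1.$$
You can show that this cubic is irreducible.

Thus $\mathbb{Q}(\zeta + \zeta^{-1})$ is a vector space of dimension 3 over Q. Since $\mathbb{Q}(\zeta)$
is of dimension 6 over Q (because $1 + x + x^2 + x^3 + x^4 + x^5 + x^6$ is irreducible), the
rest of the extension, that is, $\mathbb{Q}(\zeta)$ over $\mathbb{Q}(\zeta + \zeta^{-1})$, must be quadratic (see
Exercise 2.38).


So, we have a field tower


$\mathbb{Q}(\zeta)$


$\mathbb{Q}(\zeta + \zeta^{-1})$


$\mathbb{Q}$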


Now for the punchline. Exercise 2.13 implies that a segment whose length
is a quadratic irrationality (resulting in an extension of degree 2) can be constructed
with ruler and compass. A succession of such constructions always
results in a field of degree $2^n$ over Q. And it can be shown (see [19, Chapter 7],
for example) that this is the whole story: a length can be constructed
with ruler and compass if and only if it lies in an extension of degree $2^n$ for
some integer n. The side length of a regular heptagon results in an extension
of degree 3. Hence the heptagon cannot be constructed with ruler and
compass.

Incidentally, the other roots of $x^3 + x^2 - 2x - 1 = 0$ are $\zeta^2 + \zeta^{-2}$ and $\zeta^3 + \zeta^{-3}$
(check this). Thus we have constructed a cubic polynomial that is irreducible
and has three real roots. Can you think of an easier way to find an irreducible
cubic with three real roots?
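
A quick numerical look supports the "check this" above. The sketch uses the fact from the text that $\zeta^k + \zeta^{-k} = 2\cos\frac{2\pi k}{7}$ is twice a real part, and it also evaluates the cubic at the only candidates for rational roots allowed by the rational root theorem. The function name `cubic` is ours.

```python
import math

def cubic(x):
    """The cubic x^3 + x^2 - 2x - 1 satisfied by zeta + 1/zeta."""
    return x**3 + x**2 - 2*x - 1

# zeta^k + zeta^(-k) = 2*cos(2*pi*k/7) for zeta = exp(2*pi*i/7)
roots = [2 * math.cos(2 * math.pi * k / 7) for k in (1, 2, 3)]
for r in roots:
    print(round(r, 6), abs(cubic(r)) < 1e-12)

# Rational root theorem: the only candidates for a rational root are +1 and -1.
print(cubic(1), cubic(-1))
```

Since neither candidate is a root, the cubic has no linear factor over Q, which is one route to its irreducibility (Exercise 2.37).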


Lookout Point 2.8. When Gauss was seventeen years old, he showed that 
it is possible to construct a regular 17-gon with ruler and compass; in fact, 
he outlined a method for carrying out the construction. A wonderful video 
shows David Eisenbud actually carrying out the steps.¹

Later, Gauss showed that a regular polygon with p sides is constructible if
p is a so-called Fermat prime, that is, a prime of the form $2^n + 1$ (such as 5 and 17).


¹ Available at https://youtube.com/watch?v=87u02TPrsl8.


Many details are missing here, but this is the basic idea.


More details are in [19, Chapter 7].


We met Fermat primes before in the discussion after equation (2.10).


Exercises

2.13 If n is a positive integer, show how to construct a segment of length $\sqrt{n}$
with ruler and compass.


2.14 In △ABC in the figure below, AB = BC = 1, and the measure of ∠B is
36°. Find AC.

[Figure: triangle ABC with B at the apex and A, C at the base.]


Hint: Use the result of Exercise 2.15.


It’s a small world after all.


2.15 Show that if n is a positive integer, then
$$\frac{x^n - 1}{x - 1} = 1 + x + x^2 + \cdots + x^{n-1}$$
in Q[x].


2.16 Find the value of


2.17 Let $f(n) = a + ar + ar^2 + \cdots + ar^n$, where n is a positive integer and a and
r are real numbers. Use the result of Exercise 2.15 to find a closed-form
formula for f(n).


2.18 What is the side length of a unit regular pentagon? How would you
construct it?


2.19 Find the length of a side of a regular decagon inscribed in the unit circle.


2.20 Show that the only automorphisms of $\mathbb{Q}(\sqrt{d})$ are the identity and the
mapping that sends $a + b\sqrt{d}$ to $a - b\sqrt{d}$, and that these two mappings form
an abelian group of order 2.


2.21 Show that partial sums of


are the ratios of consecutive Fibonacci numbers.


2.22 In Section 2.2.1, we stated that in $\mathbb{Q}(\zeta_5)$, addition and multiplication are
commutative and that all the usual rules of high-school algebra hold.
Prove this.


2.23 Establish the formal identity


$$(x^2 + y^2)^2 = (x^2 - y^2)^2 + (2xy)^2.$$


Replace x and y by some integers and describe what you get.


2.24 Show that
$$(\zeta_5^3 - \zeta_5^2 + 2\zeta_5)(7\zeta_5^3 + 9\zeta_5^2 + 8\zeta_5 + 3) = -11.$$


2.25 Are there any polynomials in Q[x] that have reciprocals in Q[x]?


2.26 Using polynomial arithmetic, characterize the set of polynomials in
Q[x] that produce the same complex number when x is replaced by $\zeta_5$.


2.27 Let a= a + fs. Express 1/a@ as a polynomial in és. 


2.28 Find the minimal polynomial for $\zeta_8$ in Q[x]. How about $\zeta_5$? How about
$\sqrt{2}$? Oh, and don’t forget $\sqrt{2} + \sqrt{3}$. Try some other interesting algebraic
numbers.


2.29 Suppose p is a prime and f and g are polynomials in Z[x]. Let $\bar{f}$ be the
polynomial in $\mathbb{Z}_p[x]$ that you get when you reduce all the coefficients
of f modulo p. Show that


(i) $\overline{f + g} = \bar{f} + \bar{g}$
(ii) $\overline{fg} = \bar{f} \cdot \bar{g}$


2.30 Prove Eisenstein’s criterion (Theorem 2.10).


2.31 Show that a polynomial $f(x) \in \mathbb{Z}[x]$ is irreducible if and only if f(x + 1)
is irreducible.


2.32 Show that


$$\frac{(x+1)^p - 1}{x} = x^{p-1} + p\,x^{p-2} + \binom{p}{2}x^{p-3} + \cdots + p.$$


2.33 Let $\zeta = \zeta_5$. Express each of these numbers in $\mathbb{Q}(\zeta)$ as a linear combination
of the basis $\{1, \zeta, \zeta^2, \zeta^3\}$:


(i) oA 
Gayo" 
(iii) £78 
(iv) 2¢7 
(v) 267+ 079 + ae 


(vi) more like these... 


2.34 Give an example of a polynomial p(x) in Q[x] with the property that
p(x) and $p(x^2)$ are both irreducible.


2.35 Calculate $\varphi(n)$ for n = 1, ..., 50. Or go higher, just for fun. Conjecture
some properties of $\varphi$. Prove some of them.


2.36 Can you show that $1 - x^2 + x^4$ is irreducible directly without using the
fact that it is the minimal polynomial for $\zeta_{12}$? Go ahead.


2.37 Show that $x^3 + x^2 - 2x - 1$ is irreducible in Q[x].


2.38 Show that in a field tower with degrees like this:

[Figure: a field tower from L up to E through an intermediate field, the two steps having degrees m and n.]

the degree of E over L is mn.


2.39 Show that it is impossible to construct (with ruler and compass) a cube
with twice the volume of a given cube.


2.40 Take It Further. If n is a nonnegative integer, how many irreducible
factors in Z[x] does $x^n - 1$ have? (Fill in the table below to gather some
data.)

  n | Number of irreducible factors of x^n − 1
 ---+------------------------------------------
  1 | 1
  2 | 2
  3 | 2
  4 |
  5 | 2
  6 |
  7 |
  8 |
  9 |
 10 |
 11 |
 12 |


A complete development of quadratic reciprocity can be found in [41, Chapter 5].
It requires some background from the earlier parts of that book.


Caution: The standard notation for the finite field of integers modulo p is
Z/pZ. It would make a long digression to describe the motivation for this,
so we have adopted the nonstandard $\mathbb{Z}_p$. See [19, Chapter 7] for details.


2.3 Modular Arithmetic 


In this section we’ll show how the algebra of the last section can be extended 
to mod p considerations and give special cases of a famous result due to Gauss 
called the law of quadratic reciprocity. We won’t be able to prove this result, 
but you will acquire some experimental familiarity with it. 

You met the finite fields $\mathbb{Z}_p$ in Section 2.2.4. We used p = 7, but any
prime will do. Carrying on with the questions asked there, another important
question to ask is this: what are the squares in Z7? On squaring everything
nonzero, we get { 1, 4, 2 }, and so these are the three squares.

Note that 2, which isn’t a square in Z, is a square in Z7. However, in Z5,
the squares are { 1, 4 }, and 2 is not a square. How can one tell when 2 is a
square in $\mathbb{Z}_p$? How about the set of all primes p for which 2 is a square in
$\mathbb{Z}_p$? Is it infinite? Do they have a pattern? For the answers to these and many
other really interesting questions, continue reading.
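
Since this section promises experimental familiarity, here is a small sketch for gathering data on squares. It simply squares every nonzero element of $\mathbb{Z}_p$; the function names `squares` and `is_prime` are ours, and the trial-division primality test is meant only for small numbers.

```python
def squares(p):
    """The sorted list of nonzero squares in Z_p."""
    return sorted({(a * a) % p for a in range(1, p)})

def is_prime(n):
    """Trial division; adequate for the small primes used here."""
    return n > 1 and all(n % d for d in range(2, int(n ** 0.5) + 1))

print(squares(7))   # squares in Z_7
print(squares(5))   # squares in Z_5

# Odd primes below 100 for which 2 is a square in Z_p.
print([p for p in range(3, 100) if is_prime(p) and 2 in squares(p)])
```

Staring at the last list (and extending the range) is a good way to start guessing the pattern the text asks about.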

The more general issue concerning “reciprocity” is simply the following.
If p and q are distinct primes, is there any relationship between p being a
square in $\mathbb{Z}_q$ and q being a square in $\mathbb{Z}_p$?

Back up a bit and fix a prime p. How does one know that $\mathbb{Z}_p$ is a field? If
$a \in \mathbb{Z}_p$ is not 0, we must show that there is an element b in $\mathbb{Z}_p$ such that ab = 1
(recall that = means equality in $\mathbb{Z}_p$). But if $a \neq 0$, then (since $1 \le a \le p - 1$) a
and p have no common factor. According to a basic result due to Euclid that
we will prove in Chapter 3, there exist integers x and y such that


$$ax + py = 1.$$


But that just means that ax = 1 in $\mathbb{Z}_p$ (if x happens to be outside the range
1, ..., p − 1, just take x mod p).

This argument proves that $\mathbb{Z}_p$ is a field, and that deserves to be celebrated
in a theorem.


Theorem 2.12. If p is a prime, then $\mathbb{Z}_p$ is a field: every nonzero element of
$\mathbb{Z}_p$ has a multiplicative inverse in $\mathbb{Z}_p$.
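
The argument is effective: the integers x and y with ax + py = 1 come out of the Euclidean algorithm, so inverses in $\mathbb{Z}_p$ can be computed without building a multiplication table. The following is a standard sketch of the extended Euclidean algorithm for integers, offered as an illustration ahead of Chapter 3; the function names are ours.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y = g = gcd(a, b)."""
    x0, y0, x1, y1 = 1, 0, 0, 1
    while b:
        q = a // b
        a, b = b, a - q * b
        x0, x1 = x1, x0 - q * x1
        y0, y1 = y1, y0 - q * y1
    return a, x0, y0

def inverse_mod(a, p):
    """Multiplicative inverse of a in Z_p, for a prime p not dividing a."""
    g, x, _ = extended_gcd(a % p, p)
    assert g == 1, "a and p must have no common factor"
    return x % p          # bring x into the range 0, ..., p - 1

print(inverse_mod(3, 7))
print([inverse_mod(a, 11) for a in range(1, 11)])
```

The final `x % p` is the "just take x mod p" step from the argument above.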


Next, string out the elements of 


25 = (OV A411 23.09=1} 
and consider { a, 2a,3a,...,a(p—1)}, a # 0 in Z,. The second sequence is 
the same as the first! Hence 


1-2-3-4-(p-1) =a-2a-3a-a(p-1) =a?"'(1-2-3--(p-1)). 


Canceling gives the basic result of modular arithmetic, due to Pierre de Fer- 
mat (1607?-1665), as stated in the following theorem. 


Theorem 2.13 (Fermat's little theorem). If $a \neq 0$ in $\mathbb{Z}_p$, then $a^{p-1} = 1$ in
$\mathbb{Z}_p$. Equivalently, if $a \in \mathbb{Z}$ and $p \nmid a$, then $a^{p-1} \equiv 1 \pmod p$. In other words, $p$
divides $a^{p-1} - 1$.
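Both ingredients of the argument are easy to test numerically. The sketch below (helper names `multiples` and `fermat_holds` are ours) checks that multiplying $1, \dots, p-1$ by $a$ only rearranges the list, and that $a^{p-1} = 1$ for every nonzero $a$.

```python
def multiples(a, p):
    """The sequence a, 2a, ..., a(p-1) reduced mod p, sorted for comparison."""
    return sorted((a * k) % p for k in range(1, p))


def fermat_holds(p):
    """True when a^(p-1) = 1 in Z_p for every a in 1..p-1."""
    return all(pow(a, p - 1, p) == 1 for a in range(1, p))


print(multiples(3, 7))   # the same elements as 1..6
print(fermat_holds(7), fermat_holds(15))
```

For a composite modulus such as 15 the check fails, since an element sharing a factor with the modulus can never have a power equal to 1.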


Using group theory, a bit of which we shall review later in this chapter,
we could have proved Theorem 2.13 more compactly as follows: Since $\mathbb{Z}_p$ is
a field, $\mathbb{Z}_p - \{0\}$ is a multiplicative group of order $p - 1$. By Lemma 2.25,
$a^{p-1} = 1$ for all $a \in \mathbb{Z}_p - \{0\}$.

Sometimes the theorem is stated in the form "if $a \in \mathbb{Z}_p$, then $a^p = a$ in $\mathbb{Z}_p$."
Or equivalently, "if $a \in \mathbb{Z}$, then $a^p \equiv a \pmod p$." And another formulation is
worthy of statement as a corollary:


Corollary 2.14. If $p$ is prime, then in $\mathbb{Z}_p[x]$,

$$x^{p-1} - 1 = (x-1)(x-2)\cdots(x-(p-1)).$$

You can check that $\mathbb{Z}_7$ is a field just by looking at its tables for addition and
multiplication. But you wouldn't want to use that method for $\mathbb{Z}_p$ when $p$ is large.

Prove that the two sequences are the same using the fact that $\mathbb{Z}_p$ is a field.

There's another very famous theorem associated with Fermat: Fermat's last theorem.
That wasn't proved until the 1990s (not by Fermat, of course). See [19] for
some of the history.

Recall that $\mathbb{Z}_p^* = \mathbb{Z}_p - \{0\}$.

Caution: This is not a rerun of what we did earlier; this all happens in $\mathbb{Z}_p$.

But notice that in $\mathbb{Z}_7$, $\tfrac12$ is 4, and in $\mathbb{Z}_{11}$, $\tfrac12$ is 6.

The next, equally important, result is called the theorem on the primitive
element. It was conjectured by Euler and used in his investigations, but
although it isn't very hard, the proof had to wait until Gauss came along. It
is the generalization of the fact that $\mathbb{Z}_7^* = \{3, 3^2, \dots, 3^6\}$. Namely, there is an
element $\rho \in \mathbb{Z}_p$ such that

$$\mathbb{Z}_p^* = \{\rho, \rho^2, \dots, \rho^{p-1}\}.$$



Let us assume this result for a while, as Euler did, and using arguments like 
those in the previous section, derive some interesting results. A proof will 
show up soon, in Section 2.5. 

The following observation is basic: Suppose $n$ divides $p - 1$. Since $\mathbb{Z}_p^*$ is
cyclic, we can find a generator $\rho$ as above. Then $\rho^{(p-1)/n} = \zeta$ is an element
of $\mathbb{Z}_p$ that generates a cyclic subgroup of order $n$, namely

$$\{1, \zeta, \zeta^2, \dots, \zeta^{n-1}\}.$$

For example, in $\mathbb{Z}_7$ with $n = 3$, $3^{6/3} = 3^2 = 2$ should generate a subgroup with
three elements. And it does: $\{1, 2, 4\}$.
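The observation can be tried out directly. This is a brute-force sketch (the names `order`, `find_generator`, and `element_of_order` are ours): it searches for a generator $\rho$ by computing orders, then forms $\zeta = \rho^{(p-1)/n}$.

```python
def order(a, p):
    """Multiplicative order of a in Z_p^* (a must not be divisible by p)."""
    n, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        n += 1
    return n


def find_generator(p):
    """Smallest rho whose powers exhaust Z_p^*; exists by the primitive element theorem."""
    return next(g for g in range(1, p) if order(g, p) == p - 1)


def element_of_order(n, p):
    """zeta = rho^((p-1)/n), which has order n when n divides p-1."""
    if (p - 1) % n != 0:
        raise ValueError("n must divide p - 1")
    return pow(find_generator(p), (p - 1) // n, p)


zeta = element_of_order(3, 7)
print(zeta, sorted({pow(zeta, k, 7) for k in range(3)}))
```

For $p = 7$ the search returns $\rho = 3$, matching the example in the text.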

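Let's return to the argument we used to construct the regular pentagon and
see how one adapts this to $\mathbb{Z}_p$. Suppose that 5 divides $p - 1$. Then one can find
an element in $\mathbb{Z}_p$, call it $\zeta$, such that $\zeta^5 = 1$ and $\zeta \neq 1$. Then $\mathbb{Z}_p$ has the five
distinct elements $1, \zeta, \zeta^2, \zeta^3, \zeta^4$. It follows that

$$\zeta^4 = \zeta^{-1}, \quad \zeta^3 = \zeta^{-2}, \quad \zeta^2 = \zeta^{-3},$$

where $\zeta^{-1}$, $\zeta^{-2}$, and $\zeta^{-3}$ make perfectly good sense, since $\mathbb{Z}_p$ is a field. Then
just as before (equation (2.9) in Section 2.2), because

$$\zeta^5 - 1 = (\zeta - 1)(1 + \zeta + \zeta^2 + \zeta^3 + \zeta^4) = 0,$$

we have

$$1 + \zeta + \zeta^2 + \zeta^3 + \zeta^4 = 0$$

and

$$\zeta^2 + \zeta^{-2} + \zeta + \zeta^{-1} + 1 = 0.$$

Hence

$$(\zeta + \zeta^{-1})^2 + (\zeta + \zeta^{-1}) - 1 = 0.$$

Hence $\zeta + \zeta^{-1}$ satisfies the quadratic equation

$$x^2 + x - 1 = 0.$$

Now the question is this: can we solve this quadratic equation in $\mathbb{Z}_p$? Sure,
because $p \neq 2$ implies that 2 has an inverse mod $p$. Call it $\tfrac12$. Then completing
the square gives

$$\left(x + \tfrac12\right)^2 = 5\left(\tfrac12\right)^2.$$

Hence in $\mathbb{Z}_p$ we have

$$\bigl(2(\zeta + \zeta^{-1}) + 1\bigr)^2 = 5.$$

In any event, $\sqrt5$ is in $\mathbb{Z}_p$(!) That means that 5 is a square in $\mathbb{Z}_p$. The condition
on $p$ was that $p \equiv 1 \bmod 5$.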
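Here is a small numerical sketch of the pentagon argument, under the assumption that we may find $\zeta$ by trial (raising $c = 2, 3, \dots$ to the power $(p-1)/5$ until the result is not 1). The function name `sqrt5_from_fifth_root` is ours.

```python
def sqrt5_from_fifth_root(p):
    """For a prime p = 1 (mod 5), return an r with r^2 = 5 in Z_p."""
    if p % 5 != 1:
        raise ValueError("need p = 1 (mod 5)")
    # c^((p-1)/5) always satisfies zeta^5 = 1 by Fermat; we need one that is not 1.
    for c in range(2, p):
        zeta = pow(c, (p - 1) // 5, p)
        if zeta != 1:
            break
    zeta_inv = pow(zeta, p - 2, p)          # inverse via Fermat's little theorem
    t = (zeta + zeta_inv) % p               # t = zeta + zeta^(-1)
    assert (t * t + t - 1) % p == 0         # t solves x^2 + x - 1 = 0
    return (2 * t + 1) % p                  # (2t + 1)^2 = 5


for p in (11, 31, 41, 61):
    r = sqrt5_from_fifth_root(p)
    print(p, r, (r * r) % p)
```

The last column printed is always 5, in line with the conclusion that 5 is a square whenever $p \equiv 1 \bmod 5$.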

Legendre introduced the following notation for this kind of statement. If
$a \neq 0$ and $a$ is a square in $\mathbb{Z}_p$, then one writes $\left(\frac{a}{p}\right) = +1$. If $a$ is not a square
modulo $p$, one writes $\left(\frac{a}{p}\right) = -1$. Using this Legendre symbol, we can phrase
the above statement as the following theorem.


Theorem 2.15. If $p \equiv 1 \bmod 5$, then $\left(\frac{5}{p}\right) = +1$.


Lookout Point 2.9. Theorem 2.15 can be stated in a slightly different way.
The hypothesis amounts to $p - 1 = 5t$, or $p = 5t + 1$. In other words, the primes
congruent to 1 modulo 5 are just the primes in the arithmetic progression
$\{5t + 1 \mid t \in \mathbb{Z}\}$, which contains the integers 1, 6, 11, 16, 21, 26, 31, .... Our
result says that 5 is a square modulo each prime $p$ in that sequence (check
this out for a few primes).

An interesting question: How many prime numbers are there in the progression
1, 6, 11, 16, 21, ...? If there were infinitely many such primes, then 5 would
be a square in $\mathbb{Z}_p$ for infinitely many $p$. See Section 2.4 for more on the story.

Another slight variation points to another interesting question: to say that 5
is not a square in $\mathbb{Z}_p$ is simply to say that $x^2 - 5$ is irreducible in $\mathbb{Z}_p[x]$. For
example, $x^2 - 5$ is irreducible in $\mathbb{Z}_7[x]$; that is, it cannot be factored into
linear polynomials in $\mathbb{Z}_7[x]$. On the other hand, $x^2 - 5$ is reducible in $\mathbb{Z}_{11}[x]$.
In fact, $x^2 - 5 = (x - 4)(x - 7)$. Indeed, 4 and 7 are the two square roots of 5
in $\mathbb{Z}_{11}$. In general, it is very difficult to know when a polynomial with integer
coefficients is reducible modulo $p$ for a prime $p$.


2.3.1 


We showed above that if $p \equiv 1\ (5)$, then 5 is a square mod $p$. When is $p$ a
square mod 5? The squares in $\mathbb{Z}_5$ are 1 and $-1$, so if $p$ is a square mod 5, then
we must have $p \equiv \pm 1\ (5)$. Is there a reciprocity here? That is, if $p \equiv -1\ (5)$,
is 5 also a square mod $p$? More generally, if $p$ and $q$ are primes, what is the
relationship between $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$?

The answer, the law of quadratic reciprocity, is one of the most beautiful 
results in arithmetic. It has several parts, so we state them one at a time: 


Quadratic Reciprocity 


Theorem 2.16 (Odd primes). Let $p$ and $q$ be distinct odd primes. If $p$ or $q$
is congruent to 1 modulo 4, then $p$ is a square modulo $q$ exactly when $q$ is a
square modulo $p$. If both are congruent to 3 modulo 4, then $p$ is a square in
$\mathbb{Z}_q$ if and only if $q$ is not a square in $\mathbb{Z}_p$.

You should test some cases and show that the above statement can be written
in the nicely symmetric form

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

Results of this type were arrived at experimentally by Fermat already in the
early 1640s.


Notice that $7 + 4 = 0$ in $\mathbb{Z}_{11}$. What is the sum of the roots of $x^2 - 5$ in
$\mathbb{Z}_p[x]$ for any prime $p$?


The condition $p \equiv \pm 1\ (5)$ is that $p$ be of the form $5k + 1$ or $5k + 4$.


Why does 4 play such an important role? That's connected to arithmetic
in the "Gaussian integers," which we take up in Chapter 3.

$(\mathbb{Z}_p^*)^2$ means (as the notation suggests) the set of squares in $\mathbb{Z}_p^*$.


See Chapter 7 of [15] for 
a geometric proof, essen- 
tially due to Eisenstein. 


Again, the details are in [41].

Lookout Point 2.10. Here is one way to think about this. Say that two odd
primes $p$ and $q$ have "positive reciprocity" if

$$p \in (\mathbb{Z}_q^*)^2 \iff q \in (\mathbb{Z}_p^*)^2.$$

"Negative reciprocity" means (no surprise)

$$p \in (\mathbb{Z}_q^*)^2 \iff q \notin (\mathbb{Z}_p^*)^2.$$


Then quadratic reciprocity for odd primes can be summarized in a table:


Prime type | 4n+1 | 4n+3
4n+1       |  +   |  +
4n+3       |  +   |  -


Theorem 2.16 was first proved by Gauss (of course) in his masterwork Disqui- 
sitiones Arithmeticae [29], a book that laid the foundation for modern number 
theory. He loved the result so much that he gave eight proofs over his career, 
including one that uses geometry. A complete proof can be found in [41]. 
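Before reading a proof, it is worth testing the table on small primes. The sketch below decides squareness by brute force (no theory assumed) and compares the observed reciprocity sign with the sign predicted by Theorem 2.16; the function names are ours.

```python
def is_square_mod(a, p):
    """True when a is a nonzero square in Z_p (brute-force search)."""
    a %= p
    return any((x * x) % p == a for x in range(1, p))


def reciprocity_sign(p, q):
    """+1 for positive reciprocity between p and q, -1 for negative."""
    return 1 if is_square_mod(p, q) == is_square_mod(q, p) else -1


def predicted_sign(p, q):
    """Sign from the table: negative only when both primes are 4n+3."""
    return -1 if (p % 4 == 3 and q % 4 == 3) else 1


odd_primes = [3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
for p in odd_primes:
    for q in odd_primes:
        if p < q:
            print(p, q, reciprocity_sign(p, q), predicted_sign(p, q))
```

The two sign columns agree on every pair, which is of course evidence and not a proof.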

Theorem 2.16 answers the question for $p$ and $q$ both odd. What happens
if one of the primes is 2? The question as to when 2 is a square in $\mathbb{Z}_p$ can be
partially answered by appealing to the octagon.

Suppose that 8 divides $p - 1$. By the theorem on the primitive element,
there is an element $\zeta$ in $\mathbb{Z}_p$ such that $\zeta^8 = 1$ and no lower positive power is
1. Then $(\zeta^4)^2 = 1$, so $\zeta^4 = \pm 1$. Since $\zeta^4 = 1$ is excluded, we conclude that
$\zeta^4 = -1$. Then just as before,

$$(\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2} = 2. \qquad (2.11)$$

(This is Exercise 2.47.) Hence $\zeta + \zeta^{-1}$ is an element of $\mathbb{Z}_p$ whose square is 2.
What have we shown?


Lemma 2.17. If $p \equiv 1\ (8)$, then $\left(\frac{2}{p}\right) = +1$.


For example, take $p = 17$. Then $17 \equiv 1\ (8)$, so $\left(\frac{2}{17}\right) = +1$. Indeed, $6^2 =
36 \equiv 2\ (17)$. A good exercise is to find a primitive $\zeta$ modulo 17 (an element
of order 8) and show that $\zeta + \zeta^{-1}$ must be either 6 or 11. (This is Exercise 2.46.)

The complete answer about the nature of 2 is given by the following theo- 
rem. 


Theorem 2.18 (The quadratic character of 2). 2 is a square in $\mathbb{Z}_p$ if and
only if $p \equiv \pm 1\ (8)$.


Another useful part of the story concerns the quadratic nature of $-1$. For
which primes $p$ is $-1$ a square in $\mathbb{Z}_p$?

Suppose that $p$ is a prime and that for some integer $a$, $-1 = a^2$ in $\mathbb{Z}_p$. A
clever idea: raise both sides to the $\frac{p-1}{2}$ power and use Theorem 2.13:

$$(-1)^{\frac{p-1}{2}} = (a^2)^{\frac{p-1}{2}} = a^{p-1} = 1.$$

This is an equation in $\mathbb{Z}_p$.

Now, in $\mathbb{Z}$, we know that $(-1)^{\frac{p-1}{2}} = \pm 1$. And as long as $p \neq 2$, we have
$-1 \neq 1$ in $\mathbb{Z}_p$, so in $\mathbb{Z}$, we must have

$$(-1)^{\frac{p-1}{2}} = 1.$$

$-1$ is already a square in $\mathbb{Z}_2$, so we can assume that $p \neq 2$.

This implies that $\frac{p-1}{2}$ is even, and this implies that $p \equiv 1\ (4)$ (why?), from
which the next lemma follows.


Lemma 2.19.

$$\left(\frac{-1}{p}\right) = 1 \implies p \equiv 1\ (4).$$


And conversely, $-1$ is a square for all primes that are congruent to 1 mod 4.
One way to see this is to use Corollary 2.14. A numerical example gives the
basic idea. Suppose, for example, that $p = 29$. Then in $\mathbb{Z}_{29}[x]$, we have

$$\begin{aligned}
x^{28} - 1 &= (x^4)^7 - 1 \\
&= (x^4 - 1)\bigl((x^4)^6 + (x^4)^5 + (x^4)^4 + (x^4)^3 + (x^4)^2 + x^4 + 1\bigr) \\
&= (x^2 - 1)(x^2 + 1)\bigl((x^4)^6 + \cdots + 1\bigr).
\end{aligned}$$

And by Corollary 2.14, the distinct roots of $x^{28} - 1 = 0$ are the nonzero
elements of $\mathbb{Z}_{29}$, namely $1, 2, 3, \dots, 28$. Hence one (in fact two) of these elements
must satisfy $x^2 + 1 = 0$.

What makes this work is that $x^2 + 1$ is a factor of $x^{28} - 1$. That depends on
the fact that $29 - 1$ is divisible by 4. So we have the converse of Lemma 2.19:


Lemma 2.20.

$$p \equiv 1\ (4) \implies \left(\frac{-1}{p}\right) = 1.$$
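The two lemmas can be checked by brute force. In this sketch (the function name is ours) we simply list the roots of $x^2 + 1$ in $\mathbb{Z}_p$ and compare with the residue of $p$ modulo 4.

```python
def square_roots_of_minus_one(p):
    """All x in 1..p-1 with x^2 + 1 = 0 in Z_p."""
    return [x for x in range(1, p) if (x * x + 1) % p == 0]


print(square_roots_of_minus_one(29))   # the two roots promised by the text
for p in (3, 5, 7, 11, 13, 17, 19, 23, 29):
    print(p, p % 4, square_roots_of_minus_one(p))
```

For $p = 29$ the roots are 12 and 17, since $12^2 + 1 = 145 = 5 \cdot 29$ and $17 = 29 - 12$.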


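Putting the lemmas together, we have a pretty result:


Theorem 2.21 (The quadratic character of -1).

$$\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}.$$

Theorems 2.16, 2.18, and 2.21 combine to give the several parts of the law
of quadratic reciprocity:


Theorem 2.22 (The Law of Quadratic Reciprocity).
(i) If $p$ and $q$ are distinct odd primes, then

$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

(ii) If $p$ is an odd prime, then $\left(\frac{2}{p}\right) = 1 \iff p \equiv \pm 1\ (8)$.

(iii) If $p$ is an odd prime, then $\left(\frac{-1}{p}\right) = (-1)^{\frac{p-1}{2}}$.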
(iv) ($4) =1.

Exercises


2.41 Find a generator for $\mathbb{Z}_p^*$ if

(i) p = 11
(ii) p = 17
(iii) p = 37
(iv) p = 101
(v) p = 109
(vi) p = 1009


2.42 Find V5 in Z1; and in Zy). 
2.43 Take It Further. Find \/5 in Zj011. 


2.44 Using the notation from Section 2.2.4, show that if a monic polynomial
$f$ is reducible in $\mathbb{Z}[x]$, then $f$ is reducible in $\mathbb{Z}_p[x]$ for every prime $p$.


2.45 Show that if $p$ is a prime and $p \equiv 1\ (8)$, then there is an element $\zeta$ in $\mathbb{Z}_p$
such that $\zeta^8 = 1$.


2.46 Find a primitive $\zeta$ in $\mathbb{Z}_{17}$ and show that $\zeta + \zeta^{-1}$ is either 6 or 11.
2.47 Using the notation of equation (2.11), show that

$$(\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2} = 2.$$


2.48 Prove Corollary 2.14. 
2.49 Prove Lemma 2.20. 
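2.50 Let $p$ be prime.


(i) Show that the product of two nonsquares in $\mathbb{Z}_p^*$ is a square in $\mathbb{Z}_p^*$.


(ii) (Euler) If $u$ and $v$ are nonzero integers, show that

$$\left(\frac{u}{p}\right)\left(\frac{v}{p}\right) = \left(\frac{uv}{p}\right).$$


2.51 (Euler) Suppose $p$ is an odd prime and $p \nmid a$. Show that in $\mathbb{Z}_p$,

$$\left(\frac{a}{p}\right) = a^{\frac{p-1}{2}}.$$


2.52 Prove Wilson's theorem: If $p$ is a prime, then

$$(p - 1)! \equiv -1\ (p).$$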


2.4 Supplement: Dirichlet’s Theorem on 
Primes in Arithmetic Progression 


In Lookout Point 2.9, we asked about the number of primes in the sequence 


1, 6, 11, 16, 21, 26, 31,.... 




An examination of a large portion of this progression might suggest to the 
optimist that there are infinitely many such primes. This is indeed the case, but 
the proof is far from trivial. More generally, consider an arbitrary arithmetic 
progression a, a + b, a+ 2b,..., where a and b have no common factor. 
Peter Gustav Lejeune Dirichlet (1805-1859) proved in the 1830s that such 
an arithmetic progression contains infinitely many primes. His proof was a 
great triumph of complex analytic machinery, and it established Dirichlet as 
the leading mathematician in the world. He assumed Gauss’s chair at the time 
of the latter's death in 1855. While Dirichlet's proof is well beyond the scope
of this book, it will come as a pleasant surprise that we can, with the aid of
our results on the "modular" fifth roots of unity and a simple fact from group
theory, establish the result for the $5n + 1$ sequence 1, 6, 11, 16, 21, ....

In order to motivate the proof, recall the famous Euclidean argument for
the existence of infinitely many primes. If $p_1, \dots, p_s$ are distinct primes,
consider their product increased by 1: $p_1 \cdots p_s + 1$. By the fundamental theorem
of arithmetic, this integer has a prime divisor, say $p$. Then $p$ must be distinct
from $p_1, \dots, p_s$. (Why?) This gives a new prime.

In order to generalize this argument to our situation, we step back a bit
and look at Euclid's argument from a higher vantage point. Euclid asserts
that the progression 1, 3, 5, 7, 9, 11, ... of all odd numbers contains infinitely
many primes. The number that makes the argument work, $p_1 \cdots p_s + 1$, is the
value obtained when $p_1 \cdots p_s$ is substituted into the polynomial $x + 1$. Thinking of
cyclotomic integers and polynomials (a bit of a stretch), note that

$$x + 1 = \frac{x^2 - 1}{x - 1}.$$

For the $5n + 1$ sequence 1, 6, 11, 16, 21, ..., we use a similar argument, this
time with the polynomial

$$\frac{x^5 - 1}{x - 1} = 1 + x + x^2 + x^3 + x^4.$$



A complete proof of Dirichlet's theorem can be found in [41].

See Exercise 2.53.

It is a good idea to ameliorate the austerity with some concrete
examples of your own. See [70] for inspiration.

Now begin as above. Suppose $p_1, \dots, p_s$ are $s$ distinct primes all of the
form $5n + 1$. Our goal is to find another prime, distinct from $p_1, \dots, p_s$, of the
form $5n + 1$.

To that end, we form the product

$$a = 5 p_1 p_2 \cdots p_s$$

and let $\ell = 1 + a + a^2 + a^3 + a^4$. This is a big integer, and so it of course has
at least one prime divisor; call it $p$. We are going to show that $p$ is the desired
new prime.

You will see in a minute why we need to toss in 5.

First, we will show that $p$ is of the form $5n + 1$. Since $p$ divides
$1 + a + a^2 + a^3 + a^4$, it follows that in $\mathbb{Z}_p$,

$$1 + a + a^2 + a^3 + a^4 = 0.$$

Therefore, $a$ is a root in $\mathbb{Z}_p$ of $x^5 - 1 = (x - 1)(1 + x + x^2 + x^3 + x^4)$.

We must show that $a \neq 1$ in $\mathbb{Z}_p$. If $a$ were equal to 1, we would have
$1 + a + a^2 + a^3 + a^4 = 1 + 1 + 1 + 1 + 1 = 0$, which holds in $\mathbb{Z}_p$ only for $p = 5$.
But by the construction of $a$, we have $\ell \equiv 1 \pmod 5$, so $5 \nmid \ell$, whence $p \neq 5$,
and so $a \neq 1$ in $\mathbb{Z}_p$.

It follows that $a$ is one of the other roots, and so $1, a, a^2, a^3, a^4$ are distinct
elements of $\mathbb{Z}_p^*$, forming a subgroup of five elements. Now recall that if $H$ is
a subgroup of order $h$ in a group $G$ of order $n$, then $h$ is a factor of $n$. This too
is proved in Section 2.5.

It follows that 5 divides $p - 1$, which is precisely the condition for $p$ to be
a member of the progression 1, 6, 11, 16, ....

And $p$ is surely distinct from $p_1, \dots, p_s$, since otherwise, it would divide
$a$ and hence divide $1 + a + a^2 + a^3 + a^4 - (a + a^2 + a^3 + a^4)$, which is equal
to 1. Thus the proof of this case of Dirichlet's theorem is a rather nice
application of roots of unity and modular arithmetic.

You should test your understanding of the argument by showing that the
sequence 1, 4, 7, 10, 13, ... contains infinitely many primes. In this case, use

$$1 + x + x^2.$$
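The construction in the proof can be run on a computer. The following is a rough sketch, assuming trial division is fast enough for the small inputs used here; `smallest_prime_factor` and `new_prime_5n_plus_1` are names we made up for illustration.

```python
def smallest_prime_factor(n):
    """Smallest prime dividing n > 1, by trial division."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n


def new_prime_5n_plus_1(primes):
    """Given primes of the form 5n+1, return another prime of that form."""
    a = 5
    for p in primes:
        a *= p                                # a = 5 * p_1 * ... * p_s
    ell = 1 + a + a**2 + a**3 + a**4
    return smallest_prime_factor(ell)         # any prime divisor of ell works


print(new_prime_5n_plus_1([]))      # a = 5, ell = 781 = 11 * 71
print(new_prime_5n_plus_1([11]))
```

Starting from the empty list, the integer $\ell$ is $781 = 11 \cdot 71$, and both prime factors are indeed congruent to 1 modulo 5.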


Exercises 


2.53 True or false: If $p_1, p_2, \dots, p_n$ is the set of the first $n$ primes, then

$$p_1 p_2 \cdots p_n + 1$$

is prime.


2.54 Show that the sequence
1, 4, 7, 10, 13, ...

contains infinitely many primes.


2.55 Show that there are infinitely many primes in the sequence defined by
$4n + 1$:

1, 5, 9, 13, 17, ....

Perhaps use the polynomial $x^2 + 1$.


2.5 A Little Group Theory 


In the past several sections, we have used the language of groups. Here, we 
review the most elementary properties of finite groups and prove the impor- 
tant result on the primitive element that formed the basis of our arithmetic 
considerations. 


Lookout Point 2.11. The arguments in this section are intentionally aus- 
tere. We shall develop a few results in the theory of finite groups with no 
examples and no historical motivation. However, the arguments, as you will 
see, involve many of the same ideas from the first several sections. 




Recall that a group is a set $G$ equipped with a binary operation, denoted
here by $\cdot$, satisfying the following axioms:


(i) $x \cdot (y \cdot z) = (x \cdot y) \cdot z$ for all $x, y, z \in G$.
(ii) There is an element $e \in G$ such that $x \cdot e = e \cdot x = x$ for all $x$.


(iii) For every $x \in G$, there is an element $y \in G$ such that $xy = e = yx$.


You should show that there is only one e in a group and that the element y 
of item (iii), which is generally denoted by x~!, is also unique. Exercise 2.56 
asks you to show that xy = xw implies y = w, so that cancellation is possible. 


Lemma 2.23. Let $a$ be an element of a finite group $G$. Then there is a smallest
positive integer $n$ such that $a^n = e$. If $a^m = e$ for a positive integer $m$, then
$n \mid m$.


Proof. Start raising $a$ to various powers: $a, a^2, a^3, \dots$. Since $G$ is finite, it
follows that $a^s = a^t$ for some $s \neq t$, $0 < s < t$. Hence $a^{t-s} = e$. So $a$ to some
power is $e$. Let $n$ be the smallest positive power such that $a^n = e$. If $a^m = e$,
then $m \ge n$. Divide $m$ by $n$ and use the division algorithm to write the result
as $m = n\delta + \rho$, where $0 \le \rho < n$.

Then $e = a^m = a^{n\delta} \cdot a^{\rho}$. Since $a^n = e$, this becomes $a^{\rho} = e$. But $0 \le \rho < n$,
so the minimality of $n$ forces $\rho = 0$. Thus $m = n\delta$ and $n \mid m$. □


The n of Lemma 2.23 is called the order of a. 

A subset $H$ of $G$ is called a subgroup if $H$ is a group with the same operation.
Let $|H|$ denote the number of elements of $H$. It is called the order of
$H$. If $a$ has order $n$, then $\{e, a, \dots, a^{n-1}\}$ is a subgroup of $G$, called a cyclic
subgroup. It has order $n$. (Thus the order of an element is the order of the
cyclic subgroup that it generates.)
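Lemma 2.23 holds in any finite group, commutative or not. As an illustration (ours, not the book's), the sketch below uses permutations written as tuples, lists the cyclic subgroup generated by one of them, and checks that $a^m = e$ exactly when the order divides $m$.

```python
def compose(f, g):
    """Product f.g of permutations stored as tuples: (f.g)(i) = f[g[i]]."""
    return tuple(f[g[i]] for i in range(len(g)))


def cyclic_subgroup(a):
    """Return [e, a, a^2, ..., a^(n-1)], where n is the order of a."""
    identity = tuple(range(len(a)))
    powers, x = [identity], a
    while x != identity:
        powers.append(x)
        x = compose(a, x)
    return powers


def power(a, m):
    """Compute a^m by repeated multiplication."""
    x = tuple(range(len(a)))
    for _ in range(m):
        x = compose(a, x)
    return x


a = (1, 2, 0, 4, 3)          # a 3-cycle on {0,1,2} times a 2-cycle on {3,4}
n = len(cyclic_subgroup(a))
print(n, [m for m in range(1, 25) if power(a, m) == tuple(range(5))])
```

The exponents printed are exactly the multiples of the order $n$.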


Lemma 2.24. If H is a subgroup of a finite group G, then the order of H 
divides the order of G. 


Proof. If $H = G$, we are through. Otherwise, let $a \in G$, $a \notin H$. Consider $aH$,
the set of all $ah$, $h \in H$. These elements are distinct from one another, and
$aH \cap H = \emptyset$, since $ah_1 = h_2$ would give $a = h_2 h_1^{-1} \in H$. If $aH \cup H = G$,
then stop. Otherwise, let $b \in G$, $b \notin aH$, $b \notin H$, and consider $bH$. Then
$bH \cap (aH \cup H) = \emptyset$, and $bH$ has $|H|$ distinct elements. Keep going until you
have exhausted the group. Each time, we have added $|H|$ new elements. So
$|G|$ is an integer multiple of $|H|$. □


The sets aH are called left cosets of H. 
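The proof of Lemma 2.24 is really an algorithm: keep picking an element not yet covered and add its coset. Here is a sketch of that procedure; the function `left_cosets` and the choice of $G = \mathbb{Z}_{13}^*$ with $H = \{1, 3, 9\}$ are our own illustrative assumptions.

```python
def left_cosets(G, H, mul):
    """Split G into left cosets aH, choosing each new a outside the cosets found so far."""
    remaining = set(G)
    cosets = []
    while remaining:
        a = min(remaining)                       # any uncovered element would do
        coset = frozenset(mul(a, h) for h in H)
        cosets.append(coset)
        remaining -= coset
    return cosets


p = 13
G = range(1, p)
H = {1, 3, 9}                                    # the powers of 3 in Z_13^*
for coset in left_cosets(G, H, lambda x, y: (x * y) % p):
    print(sorted(coset))
```

Each coset has $|H| = 3$ elements, and the four cosets together account for all 12 elements of the group.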


Lemma 2.25. If $a \in G$ and $G$ is a finite group, then the order of $a$ divides the
order of $G$.


Proof. If $a$ has order $n$, then $H = \{e, a, \dots, a^{n-1}\}$ is a subgroup of $G$ of
order $n$, so apply the preceding lemma. □

Note that the operation $\cdot$ need not be commutative.


The division algorithm 
codifies “division with 
remainder” for integers. 
Its proof can be found in 
Chapter 3 (Lemma 3.3). 


Note the overloading of 
the notation |-|. We have 
seen it used to denote 
absolute value, and here 
it is being used both for 
the number of elements of 
a group and the minimal 
positive power of a group 
element equal to the 
identity.

That $m = s_1 t_1$ follows, for example, from the fundamental theorem of
arithmetic, which we will prove in Chapter 3.


The relationship between 
factors and roots of 
polynomials shows up in 
high-school algebra in 
R[x], in Corollary 2.14 in 
Zp [x], and here in F[x]. 
What makes it work in all 
these systems? 


The first complete proof 
of Theorem 2.28 was 
given by Gauss (of course) 
in [29]. 


Chapter 2. Polygons and Modular Arithmetic 


Earlier in this section, we stated and used the result due to Gauss that
$\mathbb{Z}_p^*$ ($= \mathbb{Z}_p - \{0\}$), viewed as a finite multiplicative group, is cyclic. We also
noticed that every finite subgroup of $\mathbb{C}^* = \mathbb{C} - \{0\}$ is also cyclic, and its
elements form a regular polygon when plotted in the complex plane. That was
proved with the help of de Moivre and Lemma 2.25. However, as mentioned
in Section 2.1, one can establish a general result about finite multiplicative
subgroups of any field whatsoever: they are always cyclic.

To prove this, we first need a lemma on abelian groups, that is, groups in 
which multiplication is commutative: xy = yx for all x, y € G. 


Lemma 2.26. Let G be a finite abelian group and a and b two elements of 
orders s and t respectively. Suppose that s and t are relatively prime. Then 
the order of ab is s - t. 


Proof. Since $a^s = e$, $b^t = e$, and $G$ is abelian, it follows that $(ab)^{st} = e$. Now
let the order of $ab$ be $m$. Then by Lemma 2.23, we know that $m \mid st$. Since $s$
and $t$ have no common factor, it follows that $m = s_1 t_1$, where $s_1 \mid s$ and $t_1 \mid t$,
so that $e = (ab)^m = (ab)^{s_1 t_1}$. Then using a little fancy footwork, we have

$$e = e^{s/s_1} = \bigl((ab)^{s_1 t_1}\bigr)^{s/s_1} = (ab)^{s t_1} = a^{s t_1} b^{s t_1} = b^{s t_1}.$$

Therefore, again by Lemma 2.23, we find that $t \mid s t_1$. Hence $t \mid t_1$, since $s$ and
$t$ have no common factor. Since $t_1 \mid t$, we see that $t = t_1$. Similarly, $s = s_1$,
and so $m = st$. □


We are now ready to prove the basic result. Suppose $F$ is a field and $G$ a
finite multiplicative subgroup of $F^* = F - \{0\}$. If $n$ denotes the order of $G$,
write $n = p_1^{a_1} p_2^{a_2} \cdots p_m^{a_m}$, where $p_1, \dots, p_m$ are distinct primes. Since $G$ has
order $n$, it follows that $\alpha^n = 1$ for each $\alpha \in G$.

By basic algebra, $x^n - 1 = \prod_{\alpha \in G}(x - \alpha)$ in $F[x]$. If $c \mid n$, then $x^c - 1$
divides $x^n - 1$. It follows that $x^c - 1$ has $c$ distinct roots in $G$. Hence for each
$i$ from 1 to $m$, $x^{p_i^{a_i}} - 1$ has $p_i^{a_i}$ roots in $G$. Let $\beta_i$ be a root of $x^{p_i^{a_i}} - 1$ that
is not a root of $x^{p_i^{a_i - 1}} - 1$. Then since $\beta_i^{p_i^{a_i}} = 1$, the order of $\beta_i$ must be
a power of $p_i$. But that power cannot be less than $a_i$, or else one would get
$\beta_i^{p_i^{a_i - 1}} = 1$. Hence the order of $\beta_i$ is $p_i^{a_i}$. But $p_1^{a_1}, \dots, p_m^{a_m}$ are mutually
relatively prime. Hence by an inductive generalization of Lemma 2.26, we
conclude that $\beta_1 \cdot \beta_2 \cdots \beta_m$ has order $p_1^{a_1} \cdot p_2^{a_2} \cdots p_m^{a_m} = n$. This means that
$G$ is cyclic! This deserves to be celebrated as a theorem.

Theorem 2.27. If $F$ is a field, then every finite subgroup of $F^*$ is cyclic.
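The proof is constructive enough to turn into a procedure for $G = \mathbb{Z}_p^*$: for each prime power $q^a$ in $p - 1$, pick $\beta$ with $\beta^{q^a} = 1$ but $\beta^{q^{a-1}} \neq 1$, and multiply the $\beta$'s together. The sketch below does this by brute-force search; the function names are ours, and the search is meant for small primes only.

```python
def prime_power_factors(n):
    """Return [(q, a), ...] with n equal to the product of the q**a."""
    factors, d = [], 2
    while d * d <= n:
        if n % d == 0:
            a = 0
            while n % d == 0:
                n //= d
                a += 1
            factors.append((d, a))
        d += 1
    if n > 1:
        factors.append((n, 1))
    return factors


def generator_mod(p):
    """Build a generator of Z_p^* as a product of elements of prime-power order."""
    g = 1
    for q, a in prime_power_factors(p - 1):
        # beta is a root of x^(q^a) - 1 that is not a root of x^(q^(a-1)) - 1
        beta = next(x for x in range(1, p)
                    if pow(x, q**a, p) == 1 and pow(x, q**(a - 1), p) != 1)
        g = (g * beta) % p
    return g


for p in (7, 11, 17, 37, 101):
    g = generator_mod(p)
    print(p, g, len({pow(g, k, p) for k in range(1, p)}) == p - 1)
```

The final column confirms that the powers of the constructed element exhaust $\mathbb{Z}_p^*$.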


As a special case, we finally state the theorem that was so essential to our 
investigation of cyclotomy. 


Theorem 2.28 (Theorem of the primitive element). The multiplicative
group of a finite field is cyclic. In particular, if $p$ is a prime, there exists an
element $\rho \in \mathbb{Z}_p$ such that

$$\mathbb{Z}_p^* = \{\rho, \rho^2, \dots, \rho^{p-1}\}.$$

Incidentally, we have retrieved the existence of polygons without trigonometry.
By the fundamental theorem of algebra, we know that $x^n = 1$ has $n$
roots in $\mathbb{C}$. They form a group, so the group is cyclic. Let $\zeta_1$ be a generator,
so the group's elements are $\zeta_i = \zeta_1^i$, $i = 1, \dots, n$. Since $\zeta_i^n = 1$, we
have $|\zeta_i| = 1$ for all $i$, and so these $n$ elements sit on the unit circle. Finally,
$|\zeta_i - \zeta_{i+1}| = |\zeta_1^i - \zeta_1^{i+1}| = |\zeta_1^i|\,|1 - \zeta_1| = |1 - \zeta_1|$, and so the $\zeta_i$ are all the same
distance apart!


Exercises 


2.56 Show that in a group G, cancellation is possible; that is, if xy = xw, 
then y =w. 

2.57 True or False: If $G$ is a group and $x, y \in G$, then one always has
$(xy)^{-1} = x^{-1} \cdot y^{-1}$. If it is true, prove it. If it is not, salvage it with
a fix.


2.58 Suppose that $F$ is a field and $c, n$ are nonnegative integers.


(i) Show that if $c \mid n$, then $(x^c - 1) \mid (x^n - 1)$ in $F[x]$.


(ii) Is the converse true? 


2.59 Prove all the assertions made in the proof of Lemma 2.24. 


2.60 The proof of Theorem 2.27 makes several assertions. Prove them all.


2.6 Orbits and Elementary Group Theory


In 1959, Helmut Wielandt (1910-2001) published a very simple proof of a
general theorem in group theory due to the Norwegian mathematician Peter
Ludwig Mejdell Sylow (1832-1918) [88]. The theorem states that if $n$ is the
order of $G$ and $p^s \mid n$ for a prime $p$ and positive integer $s$, then there is a
subgroup $H$ of order $p^s$. The standard proofs prior to Wielandt's were considered
somewhat difficult for beginners. Many textbooks on algebra and elementary
group theory now contain an exposition of Wielandt's proof. In this
section, we develop this point of view and prove a number of results about
finite groups.

The basic notion is that of a group $G$ operating on a set $S$. In other words, if
$g \in G$ and $s \in S$, then $g(s)$ denotes an element of $S$. Picture it like Figure 2.7.

The group structure of G is involved in two axioms that we shall impose 
on the action of G on S: 


1. In the first place, we want the identity $e$ of $G$ to operate on $S$ like an
identity: $e(s) = s$ for all $s \in S$.


2. Our second condition states that multiplying in $G$ corresponds to composition
of the action on $S$. In symbols, this is the requirement that for
$g_1, g_2 \in G$, we should have

$$(g_1 \cdot g_2)(s) = g_1(g_2(s)).$$

We then say that $G$ operates on $S$ and that the action of $G$ on $S$ is good.

Figure 2.7. Every element $g \in G$ sends $s \in S$ to $g(s) \in S$.

Recall that $|\zeta|$ denotes the absolute value of $\zeta$.

The action of $G$ on $S$ equips $G$ with additional structure: elements of $G$
become functions with domain $S$ and image in $S$.

You should also check that $\sigma(e(a, b)) = (\sigma \cdot e)(a, b)$, and so on.

$G$ cyclically permutes the elements of $S$.

Let us give a few examples.


Example 1. Fix a set $H$. Let $S$ be the set whose elements are pairs $(a, b)$
with $a, b \in H$. For the group $G$ we take a cyclic group with two elements, say
$\{e, \sigma\}$, $\sigma^2 = e$. We shall let $G$ operate on $S$ according to the prescription

$$e(a, b) = (a, b), \qquad \sigma(a, b) = (b, a).$$

To check that $G$ really operates on $S$ we need, by axiom 2, to see that
$\sigma(\sigma(a, b)) = \sigma^2(a, b)$,
which is immediate, since $\sigma^2 = e$ and $\sigma(\sigma(a, b)) = \sigma(b, a) = (a, b)$.
Example 2. Do the same as above, but put

$$S = \underbrace{H \times H \times \cdots \times H}_{n},$$

and let $G$ be a cyclic group with $n$ elements, $G = \{e, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$,
$\sigma^n = e$. How shall $G$ operate on $S$? Define

$$\begin{aligned}
e(a_1, a_2, \dots, a_n) &= (a_1, a_2, \dots, a_n), \\
\sigma(a_1, a_2, \dots, a_n) &= (a_n, a_1, a_2, \dots, a_{n-1}), \\
\sigma^2(a_1, a_2, \dots, a_n) &= (a_{n-1}, a_n, a_1, a_2, \dots, a_{n-2}), \\
&\ \,\vdots \\
\sigma^{n-1}(a_1, a_2, \dots, a_n) &= (a_2, a_3, a_4, \dots, a_n, a_1).
\end{aligned}$$

Check that $G$ operates on $S$. We shall see a little later how this simple action
gives a swift proof that when $p$ divides the order of a group $G$, for a prime $p$,
then there is an element of order $p$ in $G$.


Example 3. Suppose $G$ already operates on a set $T$. Let $S$ be the set of all
subsets of $T$. If $A \in S$, then define $g(A) = \{g(t) \mid t \in A\}$. Then $g(A) \in S$ and
$G$ operates on $S$ (check this).

Example 4. This time, let $G$ be a group and let the set that we usually call $S$
be $G$ as well. That is, we are going to let $G$ operate on itself. If $g \in G$, then
define $g(s) = g \cdot s \cdot g^{-1}$ for all $s \in G$. Then $e(s) = e s e^{-1} = s$ for all $s$, and we
have

$$\begin{aligned}
(g_1 \cdot g_2)(s) &= (g_1 g_2)\, s\, (g_1 g_2)^{-1} = g_1 g_2\, s\, g_2^{-1} g_1^{-1} \\
&= g_1 (g_2 s g_2^{-1}) g_1^{-1} = g_1(g_2(s)).
\end{aligned}$$

We say that $G$ operates on $G$ by inner automorphisms.


Example 5. Let $G$ be any group and $H$ a subgroup of $G$. We let $S$ be the set
of left cosets $aH = \{ah \mid h \in H\}$. The action of $G$ on $S$ is given by

$$g(aH) = gaH \in S.$$


Example 6. This is the one used in proving Sylow's theorem. Here $G$ is a
fixed group, $s$ is a fixed positive integer, and $S$ is the set whose elements are
subsets consisting of $p^s$ elements of $G$. If $A \in S$, then define the action by
$g(A) = \{ga \mid a \in A\}$. You can show that $g(A) \in S$ and $G$ operates on $S$. Try
it.


Each of the above examples will be used in later applications. 


If G operates on a set S, then we define the orbit G(s) of an element s € S 
by G(s) = {g(s) | g € G}. In Example 4, the orbit of an element a is called 
the conjugacy class of a. A basic result is the following lemma. 


Lemma 2.29. Two orbits are either identical or disjoint. 


Proof. It is enough to show that if two orbits $G(s)$ and $G(s')$ have a common
element, then $G(s) = G(s')$. But if $g(s) = g'(s')$, then by definition of $G$
operating on $S$, we have $s = e(s) = g^{-1}g(s) = g^{-1}g'(s')$. Hence $s \in G(s')$,
and it follows that $G(s) \subset G(s')$. By symmetry, $G(s') \subset G(s)$. □


It follows that the action of G on S breaks S up into disjoint orbits. For 
applications, it is important to have some way of obtaining information on 
the number of elements in a given orbit. For example: 


(i) If $g(s) = s$ for all $g \in G$, then $G(s) = \{s\}$, and the orbit has just one
element.

(ii) In Example 2 above, take $H \times H \times H$. Then $(a, a, a)$ has an orbit of one
element, while $(a, b, b)$, $a \neq b$, has an orbit of three elements: $(b, a, b)$, $(b, b, a)$,
and $(a, b, b)$.


Another example: If we let a cyclic group of order 6 operate on $H \times H \times H \times
H \times H \times H$ as in Example 2, then $(a, b, a, b, a, b)$, $a \neq b$, has an orbit of two
elements. Which elements of $G = \{e, \sigma, \sigma^2, \sigma^3, \sigma^4, \sigma^5\}$ leave $(a, b, a, b, a, b)$
fixed? Answer: $\{e, \sigma^2, \sigma^4\}$, which is the subgroup of $G$ of order 3. We call
$\{e, \sigma^2, \sigma^4\}$ the stabilizer (or the isotropy group) of $(a, b, a, b, a, b)$.


$(g_1 \cdot g_2)(aH) = g_1 \cdot (g_2 \cdot aH) = g_1(g_2(aH))$.


What is the stabilizer of $(a, b, c, a, b, c)$ if $a, b, c$ are distinct elements
of $H$?

$|G(s)|$ denotes the number of elements in $G(s)$.

Let $G$ operate on a set $S$. The stabilizer, or the isotropy group, of an element
$s \in S$ is the set $I_s = \{g \in G \mid g(s) = s\}$. The set $I_s$ is a subgroup of $G$.
Thus the stabilizer of an element $s$ of $S$ is the subgroup of $G$ of elements that
don't move $s$, i.e., that leave $s$ fixed.

In the above examples, notice that the number of elements in the orbit of 
an element of a group G multiplied by the number of elements in the stabilizer 
of that element is equal to the order of G. 
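The cyclic-shift action of Example 2 is easy to model with tuples, which lets us count orbits and stabilizers directly. In this sketch (function names ours) the group element $\sigma^k$ is represented simply by the integer $k$.

```python
def shift(s, k):
    """Apply sigma^k: move the last k entries of the tuple s to the front."""
    k %= len(s)
    return s[-k:] + s[:-k] if k else s


def orbit(s):
    """The set of all images of s under the cyclic group of order len(s)."""
    return {shift(s, k) for k in range(len(s))}


def stabilizer(s):
    """Exponents k for which sigma^k leaves s fixed."""
    return [k for k in range(len(s)) if shift(s, k) == s]


s = tuple("ababab")
print(len(orbit(s)), stabilizer(s))              # orbit of 2, stabilizer {e, sigma^2, sigma^4}
print(len(orbit(s)) * len(stabilizer(s)) == len(s))
```

The same check works for any tuple: the orbit size times the stabilizer size equals the order of the group.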

If $G_1$ is a subgroup of a finite group $G$, then the index of $G_1$ in $G$, denoted
from now on by $[G : G_1]$, is, by definition, the number of left cosets of $G_1$.
Lemma 2.24 tells us that if the order of $G_1$ is $m$ and the order of $G$ is $n$, then
$m$ divides $n$. Thus $[G : G_1] = n/m$.

In the above examples, we can use this vocabulary to rephrase "the number
of elements in the orbit of an element multiplied by the number of elements
in the stabilizer of that element is equal to the order of $G$" as "the number of
elements in the orbit of $s \in S$ is equal to the index of its stabilizer." This is
true quite generally.


Lookout Point 2.12. There is a way in which the above statement makes
some intuitive sense: The real action of $G$ on an element $s$ happens outside
the action of the stabilizer of $s$; the stabilizer just leaves $s$ alone. In a sense,
to see what really happens to $s$ under $G$'s action, you can "mod out" by $I_s$.
This is the same principle as viewing $\mathbb{Z}_7$ as $\mathbb{Z}$, ignoring multiples of 7.


More precisely: 


Lemma 2.30. If $I_s$ is the stabilizer of an element $s \in S$, then the number of
elements in the orbit $G(s)$ of $s$ is equal to the index of $I_s$ in $G$. In symbols,

$$|G(s)| = [G : I_s], \quad \text{or} \quad |G(s)| \cdot |I_s| = |G|.$$

Proof. Write a left coset decomposition of $I_s$ in $G$:

$$G = g_1 I_s \cup g_2 I_s \cup \cdots \cup g_m I_s.$$

Then $m$ is the index of $I_s$ in $G$, by definition. To prove the lemma, it is enough
to show that the set $\{g_1(s), g_2(s), \dots, g_m(s)\}$ comprises precisely the distinct
elements in the orbit $G(s)$. And indeed it does. Here is why.

First, the elements are distinct, for $g_i(s) = g_j(s)$ implies $g_j^{-1} g_i(s) = s$
implies $g_j^{-1} g_i \in I_s$ implies $g_i \in g_j I_s$, which is impossible. (Why?)

Next let $g(s) \in G(s)$. Then using the above decomposition, you can say
that $g = g_i h$ for some $i$ and $h \in I_s$. Then $g(s) = g_i h(s) = g_i(s)$, an element of
our alleged orbit $\{g_1(s), g_2(s), \dots, g_m(s)\}$. Done. □


Corollary 2.31. With notation as in the lemma, both $|G(s)|$ and $|I_s|$ divide $|G|$.


Let us put Lemmas 2.29 and 2.30 to work. 


2.6 Orbits and Elementary Group Theory 


Application 1. Let H be a finite group and let p be a prime dividing the 
order of H, which we denote by n. A theorem due to Cauchy states that there 
is an element of order p in H. The following “orbit” proof is due to James 
McKay [55]. 

Consider the set $S$ of all $p$-tuples $(a_1, a_2, \ldots, a_p)$ with $a_1, \ldots, a_p \in H$ and
$a_1 a_2 \cdots a_p = e$. As in Example 2, define a cyclic permutation $\sigma$ of the elements
of $S$ by

$$\sigma(a_1, a_2, \ldots, a_p) = (a_p, a_1, a_2, \ldots, a_{p-1}).$$

Let $G_p$ denote the cyclic group $\{ e, \sigma, \sigma^2, \ldots, \sigma^{p-1} \}$ of order $p$. Then $G_p$
operates on $S$.

We must check, of course, that $G_p$ is a good action. In Example 2, you
saw that the action satisfies the two properties from Section 2.6. We also need
to check that $(a_p, a_1, \ldots, a_{p-1}) \in S$. But $a_1 a_2 \cdots a_p = e$, so $a_1 \cdots a_{p-1}$ is the
inverse of $a_p$. It follows that $a_p a_1 a_2 \cdots a_{p-1} = e$.

Now let us look at orbits. If the orbit of $(a_1, \ldots, a_p)$ has only one element,
then $a_1 = a_2 = \cdots = a_p$, so that $a_1^p = e$, and $a_1$ is an element of order $p$
unless $a_1 = e$. If $s \in S$ has an orbit with more than one element, then $I_s$ is a
proper subgroup of $G_p$, and that forces $I_s = \{ e \}$, since $p$ is prime. This means
that $G(s)$ has $p$ elements.

Hence either the orbit has one element or $p$ elements. Now use Lemma
2.29. The set $S$ is partitioned into disjoint orbits. If there are $k$ elements with
orbit just a single element, then the number of elements of $S$ is $k + pt$ for
some whole number $t$. But for each choice of $a_1, \ldots, a_{p-1}$, there is a unique
$a_p$ such that $a_1 a_2 \cdots a_p = e$. Thus the number of elements in $S$ is $n^{p-1}$.
Hence $n^{p-1} = k + pt$, from which it follows that $p \mid k$, since $p \mid n$. Hence
$k > 1$, and there is an element of orbit a single point besides $(e, e, e, \ldots, e)$.
That gives an element of order $p$.
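The counting in this proof can be watched in action on a small group. The following sketch assumes, purely for illustration, that $H = S_3$ (so $n = 6$) and $p = 3$; the names `mul`, `prod_of`, `S`, and `fixed` are ours and not part of the text.

```python
from itertools import permutations, product

p = 3
H = list(permutations(range(3)))   # assumption: H = S_3, of order n = 6
e = tuple(range(3))                # identity permutation

def mul(a, b):
    """Compose permutations: (a*b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(b)))

def prod_of(tup):
    """Product a_1 a_2 ... a_p of the entries of a tuple of group elements."""
    result = e
    for a in tup:
        result = mul(result, a)
    return result

# S is the set of p-tuples whose product is the identity.
S = [t for t in product(H, repeat=p) if prod_of(t) == e]

# A tuple is fixed by the cyclic shift exactly when all its entries are equal.
fixed = [t for t in S if all(a == t[0] for a in t)]
k = len(fixed)
print(len(S), k)
```

Here `len(S)` should equal $n^{p-1}$, and `k` counts the solutions of $a^p = e$ in $H$, the identity included.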


Lookout Point 2.13. This argument is a direct generalization of the argu- 
ment that shows that a group with an even number of elements has an element 
of order 2: Each element $a$ of order greater than 2 can be paired with its inverse
$a^{-1} \neq a$. Thus there is an even number of such elements. What remains are
the identity and the elements of order 2. Since the group has an even number 
of elements altogether, it must have an odd number of elements of order 2, 
hence at least one such element. 


Application 2. Let G be a group. If a € G and ag = ga for all g € G, then 
we say that a belongs to the center of G. The center of G is denoted by Z(G), 
and you can see that Z(G) forms a subgroup. Here is another way to describe 
Z(G). Put $S = G$ and let $G$ operate on $G$ by inner automorphisms (Example 4).
Thus $g(s) = gsg^{-1}$. What is the isotropy group $I_s$? By definition,

$$I_s = \{ g \mid gsg^{-1} = s \} = \{ g \mid gs = sg \}.$$


Thus $I_s$ is the subgroup of all elements in G that commute with s. If $I_s = G$,
then every element commutes with s, and we have $s \in Z(G)$. In other
words, Z(G) is the set of single-element orbits. When does Z(G) contain
more elements than just e? A partial answer is given in the following lemma.

Much of this argument uses Lemma 2.24.

Thus the center of G is all of G if and only if G is abelian.

Z is for the German word Zentrum = center.

For example, $\nu_5(2625) = 3$.


Lemma 2.32. If the order of G is $p^n$, for a prime p, then $Z(G) \neq \{ e \}$.


Proof. It suffices to show that the number of single-element orbits is greater
than 1. To that end, we decompose G under the above action into disjoint
orbits. Each orbit G(s) contains either one element (an element of Z(G)) or
$p^j$ elements for some $j \ge 1$ (Corollary 2.31). Counting, we have

$$p^n = |G| = k + pt,$$

where $k \ge 1$ is the number of single-element orbits. We then have $p \mid k$,
whence $k \ge p$, and therefore $Z(G) \neq \{ e \}$. □
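To see the orbit count behind Lemma 2.32 concretely, here is a sketch for one group of prime-power order. We assume, for illustration only, the dihedral group of order $8 = 2^3$, realized as permutations of the vertices 0, 1, 2, 3 of a square; the helper names are hypothetical.

```python
def mul(a, b):
    """Compose permutations: (a*b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(a):
    """Inverse permutation of a."""
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

def generate(gens):
    """Closure of the generators under multiplication (fine for small finite groups)."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in gens:
            gh = mul(g, h)
            if gh not in group:
                group.add(gh)
                frontier.append(gh)
    return group

rotation = (1, 2, 3, 0)      # quarter turn of the square
reflection = (0, 3, 2, 1)    # flip across the diagonal through vertices 0 and 2
G = generate([rotation, reflection])

def conj_orbit(s):
    """Orbit of s under conjugation g s g^{-1}."""
    return frozenset(mul(mul(g, s), inverse(g)) for g in G)

orbits = {conj_orbit(s) for s in G}
orbit_sizes = sorted(len(o) for o in orbits)
center = sorted(s for s in G if len(conj_orbit(s)) == 1)
print(orbit_sizes, center)
```

The orbit sizes are all powers of 2 and add up to 8, and the number of single-element orbits is divisible by 2, just as the proof requires.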


Application 3. Sylow’s theorem. Let G be a group with n elements and suppose
that $p^s \mid n$, where p is a prime. We will locate a subgroup of order $p^s$. To
do this, we let S be the set whose elements are subsets of G with $p^s$ elements.
The number of elements in S is

$$\binom{n}{p^s} = \frac{n(n-1)\cdots(n - p^s + 1)}{p^s(p^s - 1)\cdots 1}.$$

We need to know the highest power of p dividing $\binom{n}{p^s}$. The answer: Put
$n = p^t m$ with $p \nmid m$. Then the highest power is $p^{t-s}$. This is seen by matching:

$$\binom{p^t m}{p^s} = \frac{p^t m}{p^s} \cdot \frac{p^t m - 1}{p^s - 1} \cdot \frac{p^t m - 2}{p^s - 2} \cdots \frac{p^t m - (p^s - 1)}{p^s - (p^s - 1)}.$$

You can show (Exercise 2.63) that each fraction except the first has the same
power of p in the numerator and denominator. Thus the answer is $t - s$.
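Before proving the claim about the binomial coefficient, it is reassuring to test it numerically. The sketch below is a spot check over a handful of small cases, not a proof; the function names `nu` and `check` are our own.

```python
from math import comb

def nu(p, r):
    """Exponent of the highest power of the prime p dividing the positive integer r."""
    count = 0
    while r % p == 0:
        r //= p
        count += 1
    return count

def check(p, t, m, s):
    """Test whether nu_p of binomial(p^t * m, p^s) equals t - s."""
    n = p**t * m
    return nu(p, comb(n, p**s)) == t - s

# Try several primes p, exponents t, cofactors m not divisible by p, and 0 <= s <= t.
results = [
    check(p, t, m, s)
    for p in (2, 3, 5)
    for t in range(1, 4)
    for m in (1, 2, 3, 4, 7)
    if m % p != 0
    for s in range(0, t + 1)
]
print(all(results))
```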

Now let $G$ operate on $S$ as in Example 6, namely, for $A \in S$ and $g \in G$, put
$gA = \{ ga \mid a \in A \}$. Check that $gA$ also has $p^s$ elements and that $G$ is a good
action on $S$.

Since $p^{t-s}$ is the highest power of $p$ dividing the number of elements of
$S$, and since, by Lemma 2.29, the orbits disjointly partition $S$, it follows that
there must exist at least one orbit whose number of elements is not divisible
by $p^{t-s+1}$.

Denote this distinguished orbit by $G(\alpha)$, where $\alpha$ is a fixed element of $S$.
The main point of the construction is this: the stabilizer $I_\alpha$ of $\alpha$ in $G$ is a
subgroup with $p^s$ elements. Let us prove this. By Lemma 2.30, we have

$$|G(\alpha)| \cdot |I_\alpha| = |G|.$$


Also, if $r$ is a positive integer, let $\nu_p(r)$ denote the exponent of the highest
power of $p$ dividing $r$. Then

$$\nu_p|G(\alpha)| + \nu_p|I_\alpha| = \nu_p|G|.$$

By definition of $G(\alpha)$, we know that

$$\nu_p|G(\alpha)| \le t - s.$$

Hence since $\nu_p|G| = t$, we see that

$$t - s + \nu_p|I_\alpha| \ge t,$$

or

$$\nu_p|I_\alpha| \ge s.$$

Thus $|I_\alpha| \ge p^s$.

Recall that $n = p^t m$ with $p \nmid m$.


In order to show that $|I_\alpha| \le p^s$, we make the following observation. Write

$$\alpha = \{ g_1, g_2, \ldots, g_{p^s} \}.$$

If $g$ is an arbitrary element of $I_\alpha$, we know by the definition of $I_\alpha$ that $g\alpha = \alpha$.
In other words,

$$\{ g_1, g_2, \ldots, g_{p^s} \} = \{ gg_1, gg_2, \ldots, gg_{p^s} \}$$

(as sets!). Hence $gg_1 = g_j$ for some $j$, with $1 \le j \le p^s$. But then $g = g_j g_1^{-1}$,
which shows that there are only $p^s$ choices for $g$. Hence $|I_\alpha| \le p^s$. We conclude
finally that $|I_\alpha| = p^s$, which proves Sylow’s theorem.
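The construction in this proof can be carried out by brute force on a small group. The sketch below assumes, for illustration, that $G$ is the alternating group $A_4$ (order $12 = 2^2 \cdot 3$) with $p = 2$ and $s = 2$, so that $t = 2$; it searches the 4-element subsets for an orbit whose size is not divisible by $p^{t-s+1}$ and returns the stabilizer. All names are hypothetical.

```python
from itertools import combinations, permutations

def mul(a, b):
    """Compose permutations: (a*b)(i) = a[b[i]]."""
    return tuple(a[b[i]] for i in range(len(b)))

def is_even(perm):
    """A permutation is even when it has an even number of inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return inversions % 2 == 0

G = [g for g in permutations(range(4)) if is_even(g)]   # assumption: G = A_4
p, s, t = 2, 2, 2                                       # |G| = p^t * m with m = 3

def translate(g, A):
    """The subset gA = { ga : a in A }."""
    return frozenset(mul(g, a) for a in A)

def find_subgroup():
    """Return the stabilizer of a subset whose orbit size is not divisible by p^(t-s+1)."""
    for A in combinations(G, p**s):
        A = frozenset(A)
        orbit = {translate(g, A) for g in G}
        if len(orbit) % p**(t - s + 1) != 0:
            return sorted(g for g in G if translate(g, A) == A)
    return None

subgroup = find_subgroup()
print(subgroup)
```

For this choice of group the stabilizer found is a subgroup with four elements, in line with the theorem.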


Theorem 2.33 (Sylow’s theorem). Let G be a group with n elements and
suppose that $p^s \mid n$, where p is a prime. Then G has a subgroup of order $p^s$.


Exercises 


2.61 Fill in the details for the claim in the proof of Sylow’s theorem that

$$\nu_p\binom{p^t m}{p^s} = t - s.$$

2.62 If p is a prime and a and b are positive integers, show that

$$\nu_p(ab) = \nu_p(a) + \nu_p(b).$$


2.63 Referring to the proof of Sylow’s theorem, show that each fraction 
except for the first has the same power of p in the numerator and denom- 
inator. 




The Fundamental Theorem of Arithmetic


In Chapter 2, we made extensive use of the fact that every positive integer 
can be written in one and only one way as a product of powers of distinct 
primes. This property of Z is basic to mathematics. It is so basic that many 
people don’t even think to question it. This is especially true in school, where 
students spend much of elementary school working with integers, using this 
unique factorization property as if it were a law of nature. For example, young 
children build “factor trees” for whole numbers, and it is usually taken for 
granted that two different trees, like those in Figure 3.1, end up with the same 
set of prime factors. 


Figure 3.1. Two factor trees for 60.


Assuming unique factorization isn’t the sole province of beginners. As we 
shall see in Section 3.2, accomplished mathematicians assumed that rings of 
cyclotomic integers enjoyed unique factorization and ended up with flawed 
proofs of a longstanding conjecture, the famous “Fermat conjecture” (read 
on). And attempts to fix the flaws contributed to the creation of modern alge- 
braic number theory. 

But happily, our old friend Z enjoys unique factorization, and that is what 
we take up in this chapter. 
And later in school, it is usually taken for granted that two different
factorizations in Z[x] produce the same irreducible factors.

Insisting that primes be positive eliminates the need for some fussiness.

It is a great experiment to ask a youngster whether 13 divides
2 × 3 × 5 × 7 × 11. Many kids will perform the multiplication, divide by 13
and (hopefully) obtain a nonzero remainder.

The existence of a minimal element among the a − bq follows from the
“well-ordering property of Z,” discussed in [19, Chapter 1].

3.1 Getting Started


We never defined the word prime in Chapter 2 (whoops). To set things right, 
a positive integer distinct from 1 is said to be prime if it has only 1 and itself
as positive divisors. From this definition it is not immediately clear that every 
positive integer different from 1 has a prime divisor. But it’s true: 


Lemma 3.1. Let $n > 1$, $n \in \mathbb{Z}$. Then there exists a prime $p$ such that $p \mid n$.

Proof. The integer $n$ has divisors bigger than 1 (itself, for example). Let $m$
be minimal in the set of divisors of $n$ larger than 1. If $m$ were not prime, then
we could write $m = ab$ with $1 < a < m$ ($a \in \mathbb{Z}$). But then we would have $a \mid m$ and
$m \mid n$, from which we conclude that $a \mid n$, which contradicts the minimality
of $m$. □


Lemma 3.2. Let $n > 1$, $n \in \mathbb{Z}$. Then $n$ can be written in the form

$$n = p_1 p_2 \cdots p_s, \quad\text{for primes } p_1, \ldots, p_s.$$

Proof. We proceed by strong induction. Check the first few cases, say $n =
2, 3, 4, 5, 6$. If $n$ is prime, we are done. So assume that $n$ is composite. Choose
$p_1 \mid n$ by Lemma 3.1. Then $1 < n/p_1 < n$, so by induction, there are primes
$p_2, \ldots, p_s$ such that

$$\frac{n}{p_1} = p_2 \cdots p_s.$$

But then

$$n = p_1 p_2 \cdots p_s. \qquad \square$$
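The proofs of Lemmas 3.1 and 3.2 translate almost directly into a procedure: repeatedly split off the smallest divisor larger than 1. The sketch below is one way to model this; the function names are our own, and the trial division stops at $\sqrt{n}$ as a convenience that the text does not discuss.

```python
def smallest_divisor(n):
    """Smallest divisor of n larger than 1; by the argument of Lemma 3.1 it is prime."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n    # no divisor up to sqrt(n), so n itself is prime

def factor(n):
    """Return primes p_1 <= p_2 <= ... <= p_s whose product is n, for n > 1."""
    primes = []
    while n > 1:
        p = smallest_divisor(n)
        primes.append(p)
        n //= p
    return primes

print(factor(60))
```

Note that this only produces a factorization; that every other method must arrive at the same list is exactly what remains to be proved.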


On grouping the distinct primes, we see that one may write $n = p_1^{a_1} \cdots p_s^{a_s}$
for distinct primes $p_1, \ldots, p_s$. But here is the catch. Suppose $p \mid n$ for a
prime $p$. How do you know that $p = p_i$ for some $i$? The proof is by no means
trivial—it requires a chain of lemmas, each of which is important in itself and 
which, taken together, determine the algebraic structure of Z. Here we go. 


Lookout Point 3.1. Before we carry on, let us tie up another loose end 
from Chapter 2, one that is essential to what follows. 


Lemma 3.3 (The division algorithm). If $a$ is an integer and $b$ is a positive
integer, then $a = bq + r$, where $q$ and $r$ are integers with $0 \le r < b$.

Proof. Consider the set of all $a - bj$, where $j$ ranges over $\mathbb{Z}$. This set has
nonnegative members (right?). Let $q$ be such that $a - bq \ge 0$ and $a - bq$ is
minimal among the nonnegative members. Put $a - bq = r$. Claim: $r < b$.
Suppose to the contrary that $r \ge b$. This would imply that $a - bq \ge b$. Then
$a - (q+1)b \ge 0$. But $a - (q+1)b < a - qb$, since $b > 0$. This contradicts the
minimality of $a - bq$, so $r < b$ after all, and that establishes the lemma. □
Another way to think about this is to consider the rational number $a/b$. It
gets caught between two consecutive integers, $q$ and $q + 1$, say. Put $\frac{a}{b} - q = w$.
This makes $|w| < 1$ (why?), and it is rigged to make

$$a = bq + bw.$$

But

$$|bw| = |b|\,|w| < |b|.$$

So we take $r$ to be $bw$.

The fact that every real number is caught between two consecutive integers
is called the Archimedean property of R. It implies that there exist real
numbers arbitrarily large in absolute value. Young children get used to this
when they play with the “number line.” Not every useful field is
Archimedean [45].
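Both descriptions of the division algorithm lead to the same computation: take $q$ to be the largest integer with $bq \le a$. A minimal sketch follows, relying on the fact that Python's `//` operator rounds toward negative infinity; the function name `divide` is our own.

```python
def divide(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < b, for an integer a and positive integer b."""
    if b <= 0:
        raise ValueError("b must be positive")
    q = a // b        # floor of a/b: the largest q with b*q <= a
    r = a - b * q     # the minimal nonnegative member of { a - b*j }
    return q, r

print(divide(17, 5), divide(-17, 5))
```

The negative case is worth a look: the remainder stays nonnegative, so the quotient for $-17$ is $-4$, not $-3$.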


Back at the ranch, consider two positive integers a and b. They have a
common divisor, namely 1. And since every divisor of a and b must be at most
max{ a, b }, one can consider their largest (aka greatest) common divisor.
Call it d. Our first goal is to show that if m is a common divisor of a and
b, then m divides d. To do this, we show that d has an amazing property: it 
turns out that the set of Z-linear combinations of a and b coincides with the 
set of multiples of d. We capture the greatest common divisor “linearly” by 
the following fundamental lemma. 


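Lemma 3.4. Let $a$ and $b$ be integers. Let

$$a\mathbb{Z} + b\mathbb{Z} = \{ ax + by \mid x, y \in \mathbb{Z} \}.$$

Then:

(i) There is a unique nonnegative integer $d$ satisfying

$$a\mathbb{Z} + b\mathbb{Z} = d\mathbb{Z}.$$

(ii) If $m \mid a$ and $m \mid b$, then $m \mid d$. Hence $d$ is the greatest common divisor of
$a$ and $b$.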


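Proof. (i) If $a = b = 0$, let $d = 0$. Otherwise, $a\mathbb{Z} + b\mathbb{Z}$ has positive elements.
Let $d$ be positive and minimal in $a\mathbb{Z} + b\mathbb{Z}$. Then $d\mathbb{Z} \subseteq a\mathbb{Z} + b\mathbb{Z}$. To get the
reverse inclusion, suppose that $m \in a\mathbb{Z} + b\mathbb{Z}$. Write $m = ds + r$, where $0 \le
r < d$. Then one sees that $m - ds \in a\mathbb{Z} + b\mathbb{Z}$ (because $a\mathbb{Z} + b\mathbb{Z}$ is closed
under addition and subtraction). Hence $r \in a\mathbb{Z} + b\mathbb{Z}$, which forces $r$ to be zero
(why?). Therefore, $d \mid m$. Hence $m$ is a multiple of $d$, and we have established
the reverse inclusion:

$$a\mathbb{Z} + b\mathbb{Z} \subseteq d\mathbb{Z}.$$

(ii) Since $a \in a\mathbb{Z} + b\mathbb{Z}$, we have $a \in d\mathbb{Z}$, which means that $d \mid a$. Similarly,
$d \mid b$. If $m \mid a$ and $m \mid b$, then $m$ divides every member of $a\mathbb{Z} + b\mathbb{Z}$. In particular,
it divides $d$, since $d \in a\mathbb{Z} + b\mathbb{Z}$. □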


Lookout Point 3.2. Lemma 3.4 is responsible for a very suggestive piece
of notation: if $a_1, \ldots, a_m$ are integers, we let $(a_1, \ldots, a_m)$ denote the set

$$a_1\mathbb{Z} + a_2\mathbb{Z} + \cdots + a_m\mathbb{Z}.$$

Using this convention, the lemma says that if $d$ is the greatest common divisor
of $a$ and $b$, then

$$(a, b) = (d).$$

This is an equality of sets. If we want to refer to integers rather than sets, the
convention is to write

$$d = \gcd(a, b).$$

But (again, context matters), we will sometimes refer to $(a, b)$ as the greatest
common divisor of $a$ and $b$. There is more to the story behind all this (see
[41]).

Yes, $(a_1, \ldots, a_m)$ often denotes an $m$-tuple. Context matters.

If 6 divides a product ab, must it divide a or b?

“Essentially one way” means that the list of prime powers in such a
factorization is unique up to the order in which they are listed.


We are now in a position to establish the result that if a prime p divides a 
product, then it divides one of the factors. 


Theorem 3.5 (Euclid’s lemma). Let p be prime. If p | ab for integers a and 
b, then p divides at least one of a and b. 


Proof. Suppose that $p \nmid a$. Since $p$ is prime, $p$ and $a$ have greatest common
divisor 1. By Lemma 3.4, we have $ax + py = 1$ for suitable integers $x$ and $y$.
Multiply by $b$ to get $ab$ into the act. This gives

$$abx + pby = b. \tag{3.1}$$

But $p \mid ab$ and $p \mid p$. Therefore, $p$ divides the left-hand side of equation (3.1),
so it also divides the right-hand side: $p \mid b$. Done. □


We conclude immediately the following: 


Lemma 3.6. If $n > 1$ is written $n = p_1^{a_1} \cdots p_s^{a_s}$, where $p_1, \ldots, p_s$ are distinct
primes, and if a prime $p$ divides $n$, then $p = p_i$ for some $i = 1, \ldots, s$.

Proof. Write $n = p_1 \cdot (p_1^{a_1 - 1} p_2^{a_2} \cdots p_s^{a_s})$. Then either $p \mid p_1$, in which case $p = p_1$,
or else $p \mid p_1^{a_1 - 1} p_2^{a_2} \cdots p_s^{a_s}$. Continue in this way. □


The fundamental theorem of arithmetic follows in the same spirit: 


Theorem 3.7 (The fundamental theorem of arithmetic for Z). Every integer
greater than 1 can be written as a product of primes in essentially one way.

Proof. The existence of such a factorization is the content of Lemma 3.2.
On to uniqueness: Suppose that $p_1^{a_1} \cdots p_s^{a_s} = q_1^{b_1} \cdots q_t^{b_t}$, where the $p_i$ and $q_j$
are primes, $p_i \neq p_j$ for $i \neq j$, and $q_i \neq q_j$ for $i \neq j$. We claim that $s = t$,
$\{ p_1, \ldots, p_s \} = \{ q_1, \ldots, q_t \}$, and if $p_i = q_j$, then $a_i = b_j$.

Well, Lemma 3.6 implies that the sets $\{ p_1, \ldots, p_s \}$ and $\{ q_1, \ldots, q_t \}$ are
the same. Hence $s = t$. After relabeling, we have

$$p_1^{a_1} \cdots p_s^{a_s} = p_1^{b_1} \cdots p_s^{b_s}.$$

Suppose, however, that $a_i \neq b_i$ for some $i$. If, say, $a_i < b_i$, then cancellation
shows that $p_i$ divides the product $p_1^{a_1} \cdots p_{i-1}^{a_{i-1}} p_{i+1}^{a_{i+1}} \cdots p_s^{a_s}$. This contradicts the
previous lemma. □


There are various ways to organize the above steps depending on how you 
like to write out formal inductions. The real point is Theorem 3.5. 


This completes the basic result. By restricting ourselves to positive primes 
and positive integers, we have avoided the harassment of the unit —1. In more 
general rings, however, you have to live with the units. The integers ±1 are
the only integers whose inverse in Q is actually in Z. They form a subgroup 
of order 2. In more general rings, however, where arithmetic still plays an 
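important role, there may be many units. For example, in $\mathbb{Z}[\zeta]$, where $\zeta$ is a
primitive sixth root of unity, there are six units:

$$1,\ \zeta,\ \zeta^2,\ \zeta^3,\ \zeta^4,\ \zeta^5.$$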


And it gets worse (or better, depending on your tastes). Consider the ring
$\mathbb{Z}[\sqrt{2}]$. Its elements look like $a + b\sqrt{2}$, where $a$ and $b$ are in $\mathbb{Z}$. In $\mathbb{Z}[\sqrt{2}]$,
$1 + \sqrt{2}$ has inverse $-1 + \sqrt{2}$, which is in $\mathbb{Z}[\sqrt{2}]$. So $(1 + \sqrt{2})^n$, where $n$ is a
positive integer, will have inverse $(-1 + \sqrt{2})^n$. Since $1 + \sqrt{2} > 1$, its powers are
all distinct, so the sequence $\{ (1 + \sqrt{2})^n \}$, $-\infty < n < \infty$, will give an infinite
cyclic group of units!
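Exact arithmetic in $\mathbb{Z}[\sqrt{2}]$ is easy to model by storing $a + b\sqrt{2}$ as the integer pair $(a, b)$. The following sketch uses that representation, which is our own assumption, to check the claims about powers of $1 + \sqrt{2}$.

```python
def mul(u, v):
    """Multiply a + b*sqrt(2) and c + d*sqrt(2), each stored as a pair of integers."""
    a, b = u
    c, d = v
    return (a * c + 2 * b * d, a * d + b * c)

def power(u, n):
    """u raised to the nonnegative integer power n."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

unit = (1, 1)        # 1 + sqrt(2)
inverse = (-1, 1)    # -1 + sqrt(2)

powers = [power(unit, n) for n in range(1, 6)]
print(powers)
print(mul(power(unit, 5), power(inverse, 5)))
```

The printed powers never repeat, and each product of matching powers comes back to $(1, 0)$, that is, to 1.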

Lemma 3.4 is often sufficient to prove results that also follow from the 
fundamental theorem. Here is an example. 


Lemma 3.8. If $a \mid st$ and $a$ and $t$ have no common factor bigger than 1,
then $a \mid s$.

Proof. Since $a$ and $t$ are relatively prime, one can find $x$ and $y$ such that
$ax + ty = 1$. Multiply by $s$ to get $asx + sty = s$. Then $a \mid a$ and $a \mid st$, so $a$
divides the left-hand side, and hence $a \mid s$. □

Can you show that $a \mid cd$, where $(c, d) = 1$, implies $a = \mu\nu$, where $\mu \mid c$
and $\nu \mid d$, using only Lemma 3.4 and not the uniqueness argument of
Theorem 3.7?


3.1.1 Computing Greatest Common Divisors 


Lemma 3.3 is the basis of an algorithm for calculating the greatest common 
divisor of two integers, an algorithm that is simple (and enjoyable) to carry 
out by hand and is easily programmed in any programming language that 
supports recursion. 

Greek mathematicians used a process called antanairesis, a free transla- 
tion of which is “back and forth subtraction,’ when they realized that one 
consequence of the arithmetic structure of the integers is that 


if $a < b$, then $\gcd(a, b) = \gcd(b - a, a)$.


In repeated applications of this process, we can replace subtraction by divi- 
sion with remainder. 
The complex roots of unity were defined in the previous chapter.

For example, see [18].

Lemma 3.9 (Euclid’s algorithm). If $a$ and $b$ are positive integers with $a < b$,
and

$$b = aq + r, \quad 0 \le r < a,$$

then

$$\gcd(a, b) = \gcd(r, a).$$


The proof is up to you (Exercise 3.1). 

Repeated applications produce a wonderful rhythm. As an example, we
illustrate one way to organize the steps that has been effective with students.
Arrange the steps in computing gcd(124, 1028) as a chain of divisions with
remainder:

1028 = 8 · 124 + 36
124 = 3 · 36 + 16
36 = 2 · 16 + 4
16 = 4 · 4 + 0

The last nonzero remainder is the greatest common divisor, so we have
gcd(124, 1028) = 4. This arrangement can be used to read off the coefficients
s and t, so that we have 4 = 124s + 1028t. Start at the next-to-last division
and solve for each remainder:

4 = 36 − 2 · 16
  = 36 − 2 · (124 − 3 · 36)
  = −2 · 124 + 7 · 36
  = −2 · 124 + 7 · (1028 − 8 · 124)
  = 7 · 1028 − 58 · 124.


Lookout Point 3.3. In fact, you can check that the two recursively defined
functions

$$s(a,b) = \begin{cases} 0 & \text{if } a = 0, \\ t(r,a) - q\,s(r,a) & \text{otherwise,} \end{cases}$$

and

$$t(a,b) = \begin{cases} 1 & \text{if } a = 0, \\ s(r,a) & \text{otherwise,} \end{cases}$$

where $b = aq + r$ is as in Lemma 3.9, calculate two integers such that

$$s(a,b)\,a + t(a,b)\,b = \gcd(a,b).$$

Model them in your favorite programming language and check them out.
Then figure out how they mimic the “start at the next-to-last division and
solve for each remainder” algorithm stated above.
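Here is one possible model of the two functions, written as a direct transcription of the recursive definitions with $b = aq + r$ supplied by `divmod`. It is meant as a starting point for experiments rather than an efficient routine: the mutual recursion recomputes the same values many times, which is harmless for small inputs. The helper `check` is our own addition.

```python
from math import gcd

def s(a, b):
    """Coefficient of a in the combination s*a + t*b = gcd(a, b)."""
    if a == 0:
        return 0
    q, r = divmod(b, a)      # b = a*q + r with 0 <= r < a
    return t(r, a) - q * s(r, a)

def t(a, b):
    """Coefficient of b in the combination s*a + t*b = gcd(a, b)."""
    if a == 0:
        return 1
    q, r = divmod(b, a)
    return s(r, a)

def check(a, b):
    """Does s(a,b)*a + t(a,b)*b equal gcd(a, b)?"""
    return s(a, b) * a + t(a, b) * b == gcd(a, b)

print(s(124, 1028), t(124, 1028))
```

For the pair (124, 1028) the coefficients agree with the by-hand computation of gcd(124, 1028) as a linear combination of 124 and 1028.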




3.1.2 Modular Arithmetic with Polynomials 


There is a deep structural similarity between the rings Z and R[x]. The key 
lever in this similarity is that there is a division algorithm—given two poly- 
nomials g and f in R[x], you can divide g by f and get a smaller remainder. 
The measure for “smaller” is now the degree, so that the division algorithm 
becomes the following lemma. 


Lemma 3.10. Let $f(x), g(x) \in \mathbb{R}[x]$ with $f \neq 0$. Then there exist $q(x), r(x) \in \mathbb{R}[x]$
with $g = qf + r$, where $r = 0$ or $\deg(r) < \deg(f)$.


It is most likely that you practiced the execution of this algorithm in high
school. Just for old times’ sake, finish off the rest of this calculation:

$$\begin{array}{rl} & 4x^3 - 14x^2 \\ x^2 + 3x - 2 \;\big) & 4x^5 - 2x^4 + x^3 \\ & 4x^5 + 12x^4 - 8x^3 \\ & -14x^4 + 9x^3 \end{array}$$


The results about Z in this section carry over with only slight modification 
to R[x]. For example, every polynomial can be factored into irreducibles, 
and the factorization is essentially unique up to order and unit factors. Oh, 
and what are the units in R[x]? You can check that if g and f are in R[x], 
then 


$$\deg fg = \deg f + \deg g.$$


This implies that the only polynomials in R[x] that have reciprocals in R[x] 
are the nonzero constants—polynomials of degree 0. 

And there’s more: Euclid’s lemma, properly formulated, holds in R[x]: 
there is a greatest common divisor for two polynomials (unique up to a unit), 
and this greatest common divisor is a linear combination of the two polyno- 
mials. 

This implies that you can compute the greatest common divisor of two
polynomials with the same routine that you used in Z. For example:

$$6x^2 + x - 1 = 3 \cdot (2x^2 - x - 1) + (4x + 2),$$

$$2x^2 - x - 1 = \left(\tfrac{1}{2}x - \tfrac{1}{2}\right)(4x + 2) + 0.$$
The degree of a polynomial f is denoted by deg(f). For a complete
development of arithmetic with polynomials with coefficients in arbitrary
fields, see [19, Chapter 6].

This answers the question raised in a sidenote in Section 2.2. One
significant difference between Z and R[x]: Z has two units and R[x] has
infinitely many.


This says that

$$\gcd(2x^2 - x - 1,\; 6x^2 + x - 1) = 4x + 2.$$

Hmm.... The high-school way to do this is to factor each polynomial into
irreducibles and take the common factors:

$$2x^2 - x - 1 = (2x + 1)(x - 1),$$
$$6x^2 + x - 1 = (2x + 1)(3x - 1),$$

so the gcd is $2x + 1$, not $4x + 2$. But recall that gcd is unique only up to unit
factors, and $4x + 2 = 2(2x + 1)$.

This peskiness goes away if we use the “linear combination” way to char- 
acterize gcd that we saw in Lemma 3.4, because 


$$(2x^2 - x - 1)\mathbb{R}[x] + (6x^2 + x - 1)\mathbb{R}[x] = (4x + 2)\mathbb{R}[x] = (2x + 1)\mathbb{R}[x],$$


as you can (and should) check. 

And there’s more. ... The two functions defined in Lookout Point 3.3 work
for polynomials! That is, if $f, g \in \mathbb{R}[x]$ and $q(x)$ and $r(x)$ are the (quotient
and remainder) polynomials guaranteed by Lemma 3.10, define two functions
$s$ and $t$ on $\mathbb{R}[x]$ by

$$s(f,g) = \begin{cases} 0 & \text{if } f = 0, \\ t(r,f) - q\,s(r,f) & \text{otherwise,} \end{cases} \tag{3.2}$$

and

$$t(f,g) = \begin{cases} 1 & \text{if } f = 0, \\ s(r,f) & \text{otherwise.} \end{cases} \tag{3.3}$$

Then

$$\big(s(f,g) \cdot f\big)\,\mathbb{R}[x] + \big(t(f,g) \cdot g\big)\,\mathbb{R}[x] = \gcd(f,g)\,\mathbb{R}[x].$$

Indeed,

$$s(f,g) \cdot f + t(f,g) \cdot g$$

is the output of Euclid’s algorithm applied to the pair $(f, g)$ (Exercise 3.6).


Lookout Point 3.4. By now, you are probably itching to implement the 
calculations and algorithms described in this chapter on a computer, and it 
is a worthwhile and satisfying adventure to build computational models of 
all this. If you are inclined to do it, one piece of advice: build your models 
in an environment that has formal expressions (polynomials, for example) as 
first-class objects. There are many such computer algebra systems, such as 
Mathematica and Wolfram Alpha; some even exist on handheld calculators
(like the TI family). A detailed account of what we do in this section,
complete with Mathematica code, can be found in [16].
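Just as an example, you can use these algorithms to compute (by hand,
even) the output of Euclid’s algorithm on the pair

$$f(x) = x^4 - x^3 - 5x^2 + 8x - 4, \qquad g(x) = 3x^3 - 6x^2 + x - 2.$$

You will get (we hope)

$$\frac{7}{4}x - \frac{7}{2}.$$

Next, compute $s(f, g)$ and $t(f, g)$:

$$s(f, g) = \frac{9}{10}x + \frac{9}{20}, \qquad t(f, g) = -\frac{3}{10}x^2 - \frac{9}{20}x + \frac{17}{20}.$$

And (applause) finally:

$$\left(\frac{9}{10}x + \frac{9}{20}\right)\left(x^4 - x^3 - 5x^2 + 8x - 4\right) + \left(-\frac{3}{10}x^2 - \frac{9}{20}x + \frac{17}{20}\right)\left(3x^3 - 6x^2 + x - 2\right) = \frac{7}{4}x - \frac{7}{2}.$$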
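If no computer algebra system is at hand, the functions (3.2) and (3.3) can still be modeled in a general-purpose language. The sketch below is one such model under assumptions of our own: a polynomial is a list of exact `Fraction` coefficients with the constant term first, the zero polynomial is the empty list, and all helper names (`trim`, `add`, `sub`, `mul`, `poly_divmod`, `poly`) are hypothetical.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients; the zero polynomial is the empty list."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

def sub(p, q):
    return add(p, [-c for c in q])

def mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_divmod(g, f):
    """Return (q, r) with g = q*f + r and r = 0 or deg r < deg f, for nonzero f."""
    q, r = [], trim(g)
    while len(r) >= len(f):
        # Cancel the leading term of r with a suitable multiple of f.
        term = [Fraction(0)] * (len(r) - len(f)) + [r[-1] / f[-1]]
        q = add(q, term)
        r = sub(r, mul(term, f))
    return q, r

def s(f, g):
    if not f:
        return []
    q, r = poly_divmod(g, f)
    return sub(t(r, f), mul(q, s(r, f)))

def t(f, g):
    if not f:
        return [Fraction(1)]
    q, r = poly_divmod(g, f)
    return s(r, f)

def poly(*coeffs):
    """Build a polynomial from integer coefficients, constant term first."""
    return trim([Fraction(c) for c in coeffs])

f = poly(-4, 8, -5, -1, 1)    # x^4 - x^3 - 5x^2 + 8x - 4
g = poly(-2, 1, -6, 3)        # 3x^3 - 6x^2 + x - 2
combo = add(mul(s(f, g), f), mul(t(f, g), g))
print(s(f, g), t(f, g), combo)
```

Exact rational arithmetic matters here; with floating-point coefficients the test for a zero remainder would become unreliable.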


There is an important application of these ideas, one that goes back to 
when C first reared its head in mathematics. The approach to complex num- 
bers taken by many Renaissance mathematicians (and the same approach 
taken by many high-school students) is, essentially, to consider complex
numbers as polynomials in $i$, where calculations are carried out as usual with the
extra simplification rule $i^2 = -1$. This amounts to looking at polynomials in $i$
and setting $i^2 + 1 = 0$.

This setting of something equal to 0 should look familiar. In constructing
$\mathbb{Z}_p$ from $\mathbb{Z}$, we threw away multiples of $p$ (that is, we set them equal to 0),
and we saw that this idea is compatible with addition and multiplication. The
ring $\mathbb{Z}_p$ turned out to be a field, thanks in large part to Theorem 2.12.

We can transport this idea to $\mathbb{R}[x]$ by exploiting its structural similarities
with $\mathbb{Z}$. More precisely, $x^2 + 1$ is irreducible in $\mathbb{R}[x]$, so it could play the role
of the prime $p$ in $\mathbb{Z}_p$. And it turns out that “reducing mod $x^2 + 1$” produces a
field, a field that is abstractly identical to C (Exercise 3.8).

And there is more: later, we will see that this construction (reducing mod- 
ulo an irreducible polynomial) is much more general, and we shall apply it to 
fields other than R. 
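The idea of computing with polynomials and then setting $x^2 + 1$ equal to 0 can be tried out directly. In this sketch a residue $a + bx$ is stored as the pair $(a, b)$; the representation and the names `mul_mod` and `reduce_mod` are assumptions made for illustration, and integer coefficients are used to keep the arithmetic exact.

```python
def mul_mod(u, v):
    """Multiply a + b*x and c + d*x modulo x^2 + 1 (so x^2 is replaced by -1)."""
    a, b = u
    c, d = v
    return (a * c - b * d, a * d + b * c)

def reduce_mod(coeffs):
    """Remainder a + b*x of a polynomial (constant term first) on division by x^2 + 1."""
    a = b = 0
    for k, c in enumerate(coeffs):
        # x^k is congruent to 1, x, -1, -x according to k mod 4.
        sign = 1 if k % 4 < 2 else -1
        if k % 2 == 0:
            a += sign * c
        else:
            b += sign * c
    return (a, b)

print(mul_mod((0, 1), (0, 1)))       # x * x
print(mul_mod((2, 3), (1, -4)))      # compare with (2 + 3i)(1 - 4i)
print(reduce_mod([3, 0, 2, 1]))      # x^3 + 2x^2 + 3
```

Comparing `mul_mod` with ordinary multiplication of complex numbers is a good first step toward Exercise 3.8.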


Exercises 


3.1 Prove Lemma 3.9. 
The high-school method (looking for common factors) produces x − 2,
right?

Don’t take our word for it; dig in and calculate or write the program.

This approach is sometimes frowned upon by educators, but it contains
the germ of a brilliant insight.

For more on Renaissance approaches to C, check out [19, Chapter 3].

And this is exactly what our teenagers and Renaissance ancestors wanted.

For inspiration, you can consult [16], but don’t do that until you have
played with this exercise for yourself.

An integral domain is a commutative ring in which every product of nonzero
elements is nonzero. What are some examples besides the Gaussian
integers? Nonexamples?

3.2 Write gcd(216, 3162) as a linear combination of 216 and 3162.


3.3 Find the remainder when each polynomial is divided by $x^2 + 1$:

(i) $5x^3 - 3x^2 + 2x + 1$
(ii) $x^4 - 3x^3 + 2x - 4$
(iii) $5x^3 - 3x^2 + 2x + 1 + x^4 - 3x^3 + 2x - 4$
(iv) $(5x^3 - 3x^2 + 2x + 1)(x^4 - 3x^3 + 2x - 4)$
(v) $(5x^3 - 3x^2 + 2x + 1)^2 + (x^2 + 1)(x^4 - 3x^3 + 2x - 4)$

3.4 Find the greatest common divisor of each pair $(f, g)$ in R[x] and write
it as a linear combination of $f$ and $g$:

(i) $(x^3 - 2x^2 - x - 2,\; x^3 - 3x^2 + 3x - 2)$
Gi). G®=1=1 
Giy Ge = 27 =9¢= 9 Ie? 49" 4 Op = 4) 
Gy (= 1a 4e"=2) 
(v) ((2x + 1)(x® - 1), (2x + 1)(x> - 1)) 
(ay Gx" =3.2 7 =2) 


3.5 Show that $x^m - 1$ divides $x^n - 1$ in $\mathbb{R}[x]$ if and only if $m \mid n$.

3.6 Consider the functions (3.2) and (3.3) defined above. Show that

$$s(f,g) \cdot f + t(f,g) \cdot g$$

is the output of Euclid’s algorithm applied to the pair $(f, g)$.

3.7 Prove Lemma 3.10.

3.8 Take It Further. Develop the theory of polynomials modulo $x^2 + 1$ and
show that the resulting ring is a field that is structurally identical to C.
Show that every polynomial is congruent (modulo $x^2 + 1$) to a linear
polynomial $a + bx$ for $a, b \in \mathbb{R}$. Make sure to show that nonzero elements
have reciprocals, and while you’re at it, find a formula for the reciprocal
of $a + bx$.


3.2 The Gaussian Integers 


Gauss developed the arithmetic of the integral domain of complex numbers
of the form $a + bi$, where $a$ and $b$ are ordinary integers, in his memoir of 1828
on biquadratic residues. He was interested in problems concerning the subgroup
of fourth powers in $\mathbb{Z}_p$, the field of integers modulo $p$, and was able
to establish the remarkable fact that 2 is a fourth power in $\mathbb{Z}_p$ for $p \equiv 1 \pmod 4$
if and only if $p = a^2 + 64b^2$. His treatment of the elementary properties of
the above integral domain, now known as the Gaussian integers and denoted
by $\mathbb{Z}[i]$, represents an important step in the early development of algebraic
number theory. Although we won’t prove the result about 2, we will, with the
aid of the simplest considerations in $\mathbb{Z}[i]$, be able to show that every prime $p$
that exceeds by 1 a multiple of 4 is the sum of two squares. Thus we generalize
the observations $13 = 9 + 4$, $41 = 25 + 16$, and $2232037 = 1^2 + 1494^2$.




Furthermore, if such a representation is unique, then the number is prime! In 
this way, the representation of 2232037 can be used to prove that it is prime 
(Euler). The result on two squares goes back to Fermat. In fact, we find a 
letter of June 15, 1641, from Fermat to Bernard Frénicle de Bessy (c. 1604–1674)
beginning, “La proposition fondamentale des triangles rectangles est
que tout nombre premier, qui surpasse de l’unité un multiple de 4, est composé
de deux quarrés” [27, p. 221].¹ It was later discovered by Lagrange that
every positive integer is the sum of four squares. The two-square result depends on the
fact that $-1$ is a square in $\mathbb{Z}_p$ only when $p \equiv 1 \pmod 4$, while the four-square
result depends on the fact that $-1$ is always the sum of two squares in $\mathbb{Z}_p$. Let
us therefore begin with the arithmetic in $\mathbb{Z}[i]$.

If $\alpha \in \mathbb{Z}[i]$, recall that $\bar{\alpha}$ denotes the complex conjugate of $\alpha$. The complex
conjugate of an element of $\mathbb{Z}[i]$ is also in $\mathbb{Z}[i]$, and moreover, $\alpha\bar{\alpha}$ is an
ordinary integer in $\mathbb{Z}$, called the norm of $\alpha$ and denoted by $N(\alpha)$.

We observe that $N(\alpha) \ge 0$, and $N(\alpha) = 0$ if and only if $\alpha = 0$. Furthermore,
as we saw in Chapter 2, the norm mapping is multiplicative:

$$N(\alpha)N(\beta) = N(\alpha\beta).$$

An element $u \in \mathbb{Z}[i]$ is called a unit if $ut = 1$ for some $t \in \mathbb{Z}[i]$, that is,
if $u$ has an inverse in $\mathbb{Z}[i]$. In other words, the units in a ring are the ring’s
invertible elements. If $u$ is a unit, then $N(u)N(t) = N(ut) = N(1) = 1$.
Since $N(u)$ is a positive integer, we have $N(u) = 1$. If we put $u = x + iy$,
then $N(u) = 1$ says that $x^2 + y^2 = 1$. The integer solutions to this equation
are $x = 0$, $y = \pm 1$, and $x = \pm 1$, $y = 0$, giving the four elements $\pm i$, $\pm 1$. So
the units form the vertices of a square in the complex plane and coincide
with the cyclic group of fourth roots of unity. We see also that the units are
precisely the elements of $\mathbb{Z}[i]$ of norm 1. Two Gaussian integers are said to
be associates if one is a unit times the other.

The really nice thing about the ring Z[i] is that one can prove a division 
algorithm. It goes as follows: 


Lemma 3.11 (Division algorithm in the Gaussian integers). Let $\alpha, \beta \in
\mathbb{Z}[i]$, $\beta \neq 0$. Then there exist $\gamma, \delta \in \mathbb{Z}[i]$ such that $\alpha = \beta\gamma + \delta$, where $0 \le
N(\delta) < N(\beta)$.

Proof. If we write $\alpha/\beta$ as the complex number $a + bi$, then it is easily seen
that $a$ and $b$ are rational numbers, but they won’t be integers in general. But
we can find integers $x$ and $y$ such that $|x - a| \le 1/2$ and $|y - b| \le 1/2$. Form
the element $x + iy$ of $\mathbb{Z}[i]$ and try it as a candidate for $\gamma$. That is, form $\beta \cdot \gamma$.
And then put $\delta = \alpha - \beta\gamma$. So we have

$$\alpha = \beta\gamma + \delta.$$

So far, all we have done is rig things so that this equation is true. Now we
would like to show that $0 \le N(\delta) < N(\beta)$.
To this end, we would like

$$N\!\left(\frac{\alpha}{\beta} - \gamma\right) < 1,$$

for then

$$N(\alpha - \beta\gamma) = N(\delta) < N(\beta).$$

But

$$N\!\left(\frac{\alpha}{\beta} - \gamma\right) = N\big(a + bi - (x + iy)\big) = N\big((a - x) + i(b - y)\big) = (a - x)^2 + (b - y)^2 \le \frac{1}{4} + \frac{1}{4} = \frac{1}{2} < 1.$$

¹ The fundamental proposition on right triangles is that every prime number that exceeds by
one a multiple of 4 is composed of two squares.

What are all the associates of 3 + 2i?

In much of the following, we are harassed by units dangling in front of our
elements. Learn to deal with it.

Where is x + yi in relation to a + bi in the complex plane? That is
Exercise 3.11.

So Z[i] has a division algorithm. We should put it to work for us and derive 
a fundamental theorem of arithmetic for Z[i]. Instead of using the word prime 
again, let’s introduce the somewhat more general concept of irreducible to 
describe an element whose only divisors are itself and a unit times itself. 
Thus an irreducible element in $\mathbb{Z}[i]$ is an element $\alpha$ whose only divisors are
units and units times $\alpha$. For example, since the only divisors of $2 + 3i$ are $1$, $i$,
$-i$, $-1$, $2i - 3$, $2 + 3i$, $-2i + 3$, and $-2 - 3i$ (Check this!), it follows that $2 + 3i$
is irreducible.

In general, suppose $a + bi$ has norm $p$, a prime. Then $a + bi$ is irreducible.
Because if $a + bi = \alpha\beta$, where $\alpha$ and $\beta$ are nonunits, then $N(\alpha) > 1$ and
$N(\beta) > 1$. But $N(a + bi) = p = N(\alpha)N(\beta)$, so that is impossible. But an
irreducible need not have a prime norm. For example, you can check that 7
is irreducible in $\mathbb{Z}[i]$ (try it). But $N(7) = 49$. We will show later that the norm
of an irreducible in $\mathbb{Z}[i]$ is either prime or the square of a prime.


Lemma 3.12. If $\alpha$ is a nonunit in $\mathbb{Z}[i]$, then there exists an irreducible
element $\pi \in \mathbb{Z}[i]$ such that $\pi \mid \alpha$.

Proof. We know that $\alpha$ has nonunit divisors ($\alpha$, for example). Let $\pi$ be a
nonunit divisor of smallest norm. If $\pi$ were not irreducible, then $\pi$ could be
written as a product, $\pi = \pi_1\pi_2$, with $N(\pi_1) > 1$ and $N(\pi_2) > 1$ (because $\pi_1$,
$\pi_2$ are nonunits). This would imply that $N(\pi_1) < N(\pi)$. But $\pi_1 \mid \alpha$,
contradicting the minimality of $N(\pi)$. □


Similarly, we have a result analogous to Lemma 3.2 in Section 3.1. 


Lemma 3.13. If α is a nonunit in Z[i], then α = π₁π₂π₃⋯πₛ, where the πᵢ,
i = 1, …, s, are irreducible.


Proof. The lemma is true for all elements of Z[i] of norm 2 (in fact, it is
true for all elements whose norm is prime, as we showed above). If α is
irreducible, then it is the product of one irreducible, and we are done. Suppose,
then, that α is not irreducible and has norm n, and assume for the induction
hypothesis that the lemma holds for all elements of norm at least 2 and less
than n. Choose π₁ | α, where π₁ is irreducible, and write α = π₁β, and so
N(α) = N(π₁β) = N(π₁)N(β), whence 2 ≤ N(β) = N(α)/N(π₁) < N(α),
the first inequality holding because β is not a unit. By the induction
hypothesis, β = π₂⋯πₛ, and so α = π₁π₂π₃⋯πₛ, as desired. ∎

This proof should feel familiar.


Lemma 3.14. If α is a nonunit, then

\[ \alpha = u \cdot \pi_1^{a_1} \cdots \pi_s^{a_s}, \]

where u is a unit and π₁, …, πₛ are irreducibles no two of which differ by a
unit (i.e., πⱼ ≠ u · πᵢ for a unit u).


Proof. Write α = π₁⋯πₜ and collect terms. ∎


Many of the results in Section 3.1 generalize to Z[i]. One important example
is that there is an analogue of the greatest common divisor, inspired by
Lemma 3.4, not as a Gaussian integer (units get in the way), but as a linear
combination. That is, if α and β are in Z[i], we put

\[ (\alpha, \beta) = \alpha\mathbb{Z}[i] + \beta\mathbb{Z}[i] \quad \bigl(= \{\alpha x + \beta y \mid x, y \in \mathbb{Z}[i]\}\bigr), \]

and we use this as our gcd.

Lemma 3.15. If α and β are in Z[i], then there is δ ∈ Z[i] such that

\[ \alpha\mathbb{Z}[i] + \beta\mathbb{Z}[i] = \delta\mathbb{Z}[i]. \]

Furthermore, δ has the following properties:
(i) δ | α and δ | β.
(ii) If μ | α and μ | β, then μ | δ.

Proof. If α = β = 0, then put δ = 0. Otherwise, consider the set of all αx + βy,
x and y ranging over Z[i]. Choose δ in that set with smallest positive norm.
If ρ is in the set, then by the division algorithm, we have

\[ \rho = s\delta + r, \quad \text{where } 0 \le N(r) < N(\delta). \]

However, r = ρ − sδ is of the form αx′ + βy′. So r is in αZ[i] + βZ[i], which
forces N(r) = 0, since δ has the smallest positive norm in that set. Therefore,
r = 0. Thus

\[ \delta\mathbb{Z}[i] \supseteq \alpha\mathbb{Z}[i] + \beta\mathbb{Z}[i]. \]

You can establish the reverse inclusion (try it).
Finally, α and β are in δZ[i], so δ | α and δ | β. If μ | α and μ | β, then μ
divides every αx + βy. In particular, μ | δ. ∎
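In practice, a generator δ of αZ[i] + βZ[i] can be found by repeated division, just as in Z. The sketch below is a hedged illustration: it swaps in the Euclidean algorithm for the "smallest norm" argument of the proof, stores a + bi as the pair (a, b), and uses the hypothetical helper names norm, mul, gauss_mod, and gauss_gcd. The result is determined only up to a unit.

```python
def norm(z):
    a, b = z
    return a * a + b * b


def mul(z, w):
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def gauss_mod(alpha, beta):
    """Remainder of alpha on division by beta (beta nonzero), with norm < N(beta)."""
    n = norm(beta)
    p, q = mul(alpha, (beta[0], -beta[1]))          # alpha * conj(beta)
    gamma = ((2 * p + n) // (2 * n), (2 * q + n) // (2 * n))  # nearest Gaussian integer
    prod = mul(beta, gamma)
    return (alpha[0] - prod[0], alpha[1] - prod[1])


def gauss_gcd(alpha, beta):
    """A generator of alpha*Z[i] + beta*Z[i], found by the Euclidean algorithm."""
    while beta != (0, 0):
        alpha, beta = beta, gauss_mod(alpha, beta)
    return alpha


print(gauss_gcd((5, 0), (2, 1)))   # 2 + i divides 5, so the gcd has norm 5
print(gauss_gcd((3, 2), (2, 3)))   # a unit: these two have no common nonunit factor
```

The loop terminates because the norm of the remainder strictly decreases at each step.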


Note that if δZ[i] = δ′Z[i], then δ | δ′ and δ′ | δ. Thus δ and δ′ are
associates, that is, δ = uδ′, with u a unit.

The basic result corresponding to Euclid’s lemma (Theorem 3.5) is as fol- 
lows. 


Theorem 3.16 (Euclid’s lemma in the Gaussian integers). If π is irreducible
and π | αβ, then π | α or π | β.

Proof. If π ∤ α, then π and α have only units as common factors. By Lemma 3.15,
therefore,

\[ \pi\mathbb{Z}[i] + \alpha\mathbb{Z}[i] = u\mathbb{Z}[i], \]

where u is a unit. Now, uZ[i] = Z[i]. Hence 1 = πs + αt for s and t from
Z[i]. Multiplying by β gives β = πβs + αβt. Thus π | π and π | αβ implies
π | β. ∎


The fundamental theorem is next, and the proof is up to you (Exercise 3.12): 


Theorem 3.17 (The fundamental theorem of arithmetic for Gaussian 
integers). Every Gaussian integer can be written as a product of irreducibles 
in essentially one way. 


A very nice way to spend (at least) a half hour is to carry through the results
of this section for the integral domain Z[ρ], where ρ = (−1 + i√3)/2 is one
of the two complex cube roots of unity satisfying 1 + ρ + ρ² = 0. Eisenstein did
this and was led to cubic reciprocity. For these results and others, he received
the praise of Gauss.

If you get stuck, see [19, Chapter 8]. The ring Z[ρ] is often called the ring of
Eisenstein integers.


Exercises 


3.9 Show that the distance between two complex numbers z and w in the
complex plane is √N(z − w).


3.10 The division algorithm locates a quotient and remainder. Are they unique 
(i) in Z? 
(ii) in Z[i]? 

3.11 Using the notation of Lemma 3.11, where is x + yi in relation to a + bi 
in the complex plane? 


3.12 Prove Theorem 3.17. 


3.3 The Two Square Theorem 


In Section 3.2, you checked that 7 is an ordinary prime that is also an
irreducible in Z[i], while 13 = (2 + 3i)(2 − 3i) is not. We say that 7 is inert, while
13 is said to split. Can one describe the set of all primes that are inert? When
does a prime split?

In order to answer these questions, we need a lemma from modular arith- 
metic. 


Lemma 3.18. If p ≡ 1 (mod 4), then −1 is a square in Zₚ.


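Proof. We did this already! Remember Theorem 2.21 (or more precisely,
Lemma 2.20) in Section 2.3? But if something is worth proving once, it is
worth proving twice. Here is a proof that has a bonus:

By Fermat’s little theorem, $a^{p-1} - 1 = 0$ for a = 1, 2, …, p − 1. Hence

\[ x^{p-1} - 1 = (x - 1)(x - 2)\cdots\bigl(x - (p - 1)\bigr) \quad \text{in } \mathbb{Z}_p. \]

Put x = 0 to obtain

\[ -1 = (-1)(-2)\cdots\bigl(-(p - 1)\bigr) = (p - 1)! \quad \text{(Wilson’s theorem!)} \]

But

\[ -1 = p - 1, \quad -2 = p - 2, \quad \ldots, \quad -\frac{p-1}{2} = \frac{p+1}{2}. \]

Hence

\[ -1 = (p - 1)! = (-1)^{(p-1)/2}\left[\left(\frac{p-1}{2}\right)!\right]^2 = \left[\left(\frac{p-1}{2}\right)!\right]^2, \]

because (p − 1)/2 is even when p ≡ 1 (mod 4). ∎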


Lookout Point 3.5. And if that’s not enough, here is another variation on a
proof: Since 4 | p − 1, the theorem of the primitive element gives an element a
of order 4. Thus a⁴ = 1, or (a² − 1)(a² + 1) = 0. Since a² ≠ 1, we conclude
that a² = −1, showing that −1 is a square.
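The Wilson's theorem argument is constructive: for a prime p ≡ 1 (mod 4), the residue of ((p − 1)/2)! is an explicit square root of −1 in Zₚ. Here is a minimal Python sketch of that observation; the function name sqrt_minus_one is an assumption of this sketch, and the function does not check that p is prime.

```python
def sqrt_minus_one(p):
    """For a prime p = 1 (mod 4), return x = ((p-1)/2)! mod p, so that x^2 = -1 in Z_p.

    Assumption: the caller supplies a prime; primality is not checked here.
    """
    if p % 4 != 1:
        raise ValueError("expected p congruent to 1 mod 4")
    x = 1
    for k in range(1, (p - 1) // 2 + 1):
        x = (x * k) % p
    return x


for p in (5, 13, 17, 29):
    x = sqrt_minus_one(p)
    # The last column is 0 exactly when x^2 = -1 in Z_p.
    print(p, x, (x * x + 1) % p)
```

This takes about p/2 multiplications, so it is fine for experiments but not for large primes.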


Now we can prove the famous two square theorem. 


Theorem 3.19. If p ≡ 1 mod 4, then

\[ p = \pi\bar{\pi}, \]

where π is irreducible in Z[i].


Proof. Since −1 is a square modulo p, we see that a² = −1 in Zₚ for an
ordinary integer a. Thus p | (a² + 1) in Z. So p | (a + i)(a − i) in Z[i]. It
follows that p is not irreducible in Z[i]. For otherwise, by Euclid’s lemma
(Theorem 3.16), we would have p | (a − i) or p | (a + i). That would say that

\[ \frac{a}{p} - \frac{1}{p}\,i \quad \text{or} \quad \frac{a}{p} + \frac{1}{p}\,i \]

is in Z[i], which is absurd. So write p = αβ with N(α) > 1, N(β) > 1. On
taking norms, we obtain

\[ p^2 = N(\alpha)N(\beta). \]

Hence by unique factorization in Z, we have p = N(α) = αᾱ. Since N(α) is
prime, α is irreducible. ∎
You met Wilson’s theorem in Exercise 2.52 in Section 2.3.

Some people think that Wilson’s theorem is named after the rock ’n’ roll legend Jackie Wilson.

On which proof is this a variation?

In fancy language, if p ≡ 1 mod 4, then p splits in Z[i].

We say that π lies over p.

The proof is up to you (Exercise 3.13).
As a corollary we have the lovely result that every prime in the sequence
{1, 5, 9, 13, 17, 21, 25, …} is expressible as the sum of two squares.


Corollary 3.20. If p ≡ 1 mod 4 is prime, then p = a² + b² for integers a and
b in Z.


Proof. Write π = a + bi in Theorem 3.19. ∎
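The proofs of Theorem 3.19 and Corollary 3.20 suggest a way to actually compute a and b: find x with x² = −1 in Zₚ, then take a gcd of p and x + i in Z[i]. That gcd is an irreducible lying over p, so its norm is p. The Python sketch below makes several assumptions of its own: pairs (a, b) stand for a + bi, the helper names are hypothetical, x is computed as ((p − 1)/2)! mod p, and the input is assumed (not checked) to be a prime congruent to 1 mod 4.

```python
def norm(z):
    a, b = z
    return a * a + b * b


def mul(z, w):
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)


def gauss_mod(alpha, beta):
    """Remainder of alpha on division by a nonzero beta, via rounding alpha/beta."""
    n = norm(beta)
    p, q = mul(alpha, (beta[0], -beta[1]))
    gamma = ((2 * p + n) // (2 * n), (2 * q + n) // (2 * n))
    prod = mul(beta, gamma)
    return (alpha[0] - prod[0], alpha[1] - prod[1])


def gauss_gcd(alpha, beta):
    while beta != (0, 0):
        alpha, beta = beta, gauss_mod(alpha, beta)
    return alpha


def two_squares(p):
    """For a prime p = 1 (mod 4), return (a, b) with a^2 + b^2 = p."""
    if p % 4 != 1:
        raise ValueError("expected a prime congruent to 1 mod 4")
    x = 1
    for k in range(1, (p - 1) // 2 + 1):   # x = ((p-1)/2)! mod p, a square root of -1
        x = (x * k) % p
    a, b = gauss_gcd((p, 0), (x, 1))       # an irreducible lying over p
    return abs(a), abs(b)


for p in (5, 13, 17, 29, 97):
    print(p, two_squares(p))
```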


Using unique factorization in Z[i], you can show that the above
representation is unique up to order and the signs of a and b (try it).

It would be good also to know whether p ≡ 3 mod 4 implies that p is inert
(does not split) in Z[i], in other words, that p is not the sum of two squares.
Indeed, if an odd prime p = a² + b², then exactly one of a² and b² would be even.
Say a² is even and b² is odd. That implies that 4 divides a² and b² is congruent
to 1 modulo 4. Thus a² + b² is congruent to 1 mod 4. Thus the inertial primes are
precisely the primes congruent to 3 modulo 4.

There is one remaining prime that we haven’t discussed. The even prime 2 
can be written as 


\[ 2 = -i(1 + i)^2. \]


The number 1 + i is irreducible, since it has norm 2, and −i is a unit. Hence
2 = u·π². In algebraic number theory, 2 is called a ramified prime.
To summarize all this: 


Theorem 3.21 (Law of decomposition in the Gaussian integers). Every
rational prime p decomposes in Z[i] in one of three ways:


(i) p splits into two conjugate prime factors if p ≡ 1 mod 4.
(ii) p is inert if p ≡ 3 mod 4.
(iii) p = 2 ramifies: 2 = −i(1 + i)².
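The law of decomposition is easy to test experimentally. The following Python sketch (function names are assumptions of this sketch) labels a rational prime by its residue mod 4 and compares the label against a brute-force search for a representation as a sum of two squares.

```python
from math import isqrt


def decomposition_type(p):
    """Label a rational prime p by its behavior in Z[i], following Theorem 3.21."""
    if p == 2:
        return "ramified"
    return "split" if p % 4 == 1 else "inert"


def is_sum_of_two_squares(n):
    """Brute-force search for integers a, b >= 0 with a^2 + b^2 = n."""
    for a in range(isqrt(n) + 1):
        b2 = n - a * a
        b = isqrt(b2)
        if b * b == b2:
            return True
    return False


for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
    print(p, decomposition_type(p), is_sum_of_two_squares(p))
```

For odd primes, the label "split" should appear exactly when the brute-force search succeeds.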


Theorem 3.21 tells us how primes in Z behave when they are viewed as
elements of Z[i]. And you can show that every irreducible in Z[i] divides a
prime in Z, because every irreducible π divides its norm, ππ̄. Using this, you
can establish how primes in Z[i] behave.


Corollary 3.22 (Classification of Gaussian irreducibles). The irreducibles
π in Z[i] are of three types:


(i) π = a + bi, and π divides a prime p ∈ Z such that p ≡ 1 mod 4. In this
case, N(π) = a² + b².
(ii) π = p, where p is a prime in Z such that p ≡ 3 mod 4. In this case,
N(p) = p².
(iii) π = 1 + i and its associates. In this case, N(1 + i) = 2.


Figure 3.2 shows one way to visualize the story.

[Figure 3.2. Primes upstairs in Z[i] and primes downstairs in Z: (1 + i) (ramified) lies over 2, (2 + i)(2 − i) (split) lies over 5, and 7 (inert) lies over 7.]

One can consider integral domains other than Z[√−1], for example, the
domain Z[√−d] for a fixed positive square-free integer d. If d ≡ −1 mod 4, you
need a slightly larger ring that allows 2 in the denominator. It turns out that the
only such rings that have a Euclidean algorithm are those for d = 1, 2, 3, 7, 11.
That the fundamental theorem of arithmetic fails for certain rings can be seen
by considering the decomposition of 6 in the ring of integers Z[√−5]. Then
6 = 3·2 = (1 + √−5)(1 − √−5), and one proves quickly that 2, 3, 1 + √−5,
and 1 − √−5 are all irreducible in Z[√−5] and that they do not differ by a unit
factor. Similarly, 9 = 3·3 = (1 + √10)(−1 + √10) in Z[√10].
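The claim about Z[√−5] can be checked with norms. With N(a + b√−5) = a² + 5b², the four factors have norms 4, 9, 6, and 6, so a proper factor of any of them would need norm 2 or 3. The Python sketch below (pairs (a, b) stand for a + b√−5; the function names are assumptions of this sketch) confirms the two factorizations of 6 and searches for elements of norm 2 or 3.

```python
from math import isqrt


def norm5(z):
    """N(a + b*sqrt(-5)) = a^2 + 5 b^2."""
    a, b = z
    return a * a + 5 * b * b


def mul5(z, w):
    """Product in Z[sqrt(-5)], using sqrt(-5)^2 = -5."""
    a, b = z
    c, d = w
    return (a * c - 5 * b * d, a * d + b * c)


def has_element_of_norm(n):
    """Is there some a + b*sqrt(-5) with a^2 + 5 b^2 = n?"""
    for b in range(isqrt(n // 5) + 1):
        a2 = n - 5 * b * b
        a = isqrt(a2)
        if a * a == a2:
            return True
    return False


print(mul5((2, 0), (3, 0)), mul5((1, 1), (1, -1)))   # both products equal 6
print(has_element_of_norm(2), has_element_of_norm(3))
```

Since no element has norm 2 or 3, none of 2, 3, 1 + √−5, 1 − √−5 can be written as a product of two nonunits. The only elements of norm 1 are ±1, so the two factorizations do not differ by units.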


3.3.1. Fermat’s Last Theorem 


Surely, one of the oldest and best-known problems in number theory involves 
the search for Pythagorean triples—triples of positive integers (a, b,c) that 
are side lengths of a right triangle, so that 


\[ a^2 + b^2 = c^2. \]


Diophantus of Alexandria developed, around 250 CE, a geometric method for
generating such triples. Stated in modern language, he realized that a rational
point on the unit circle (the graph of x² + y² = 1), when written in the form
(a/c, b/c), produces a Pythagorean triple:

\[ \left(\frac{a}{c}\right)^2 + \left(\frac{b}{c}\right)^2 = 1 \implies a^2 + b^2 = c^2. \]


One can get such a rational point by forming a line with positive rational 
slope through the point P = (—1,0) and intersecting the line with the circle. 
The second intersection point will then be rational (check this). Hence, it was 
known early on that there are infinitely many Pythagorean triples (details are 
in [19]). 
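The chord construction can be carried out with exact rational arithmetic. Substituting y = t(x + 1) into x² + y² = 1 and discarding the root x = −1 gives x = (1 − t²)/(1 + t²) and y = 2t/(1 + t²). The following Python sketch uses the standard fractions module; the function names are assumptions of this sketch.

```python
from fractions import Fraction
from math import gcd


def second_intersection(t):
    """Second point where the line through (-1, 0) with slope t meets the unit circle."""
    t = Fraction(t)
    x = (1 - t * t) / (1 + t * t)
    y = 2 * t / (1 + t * t)
    return x, y


def triple_from_slope(t):
    """Clear denominators in the rational point (a/c, b/c) to get (a, b, c)."""
    x, y = second_intersection(t)
    c = x.denominator * y.denominator // gcd(x.denominator, y.denominator)
    return int(x * c), int(y * c), c


for t in (Fraction(1, 2), Fraction(2, 3), Fraction(1, 4)):
    print(t, second_intersection(t), triple_from_slope(t))
```

Slopes strictly between 0 and 1 give points in the open first quadrant, hence triples of positive integers.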

There are several algebraic methods for generating Pythagorean triples. 
One method builds on an old party trick: Ask each person at a party to pick 
a favorite Gaussian integer r + si (make r > s > 0) and square it. Watch eyes 
light up: 


(2 + i)² = 3 + 4i,
(3 + 2i)² = 5 + 12i,
(5 + 2i)² = 21 + 20i,
(5 + 4i)² = 9 + 40i.


The punchline: the square of a Gaussian integer seems to be of the form x + yi,
where x and y are the legs of a Pythagorean triple. You should prove this using
only high-school algebra.

Compute the norms of each of these Gaussian integers for another nice punchline.

A T-shirt that celebrates nonunique factorization.

Fermat was not the first mathematician to write a marginal note in a copy of Diophantus. Next to the same problem, the Byzantine mathematician Maximus Planudes wrote, “Thy soul, Diophantus, be with Satan because of the difficulty of your theorems.”

This is more than a party game, though. It expresses a property of Gaussian
integers that you may have noticed: if z ∈ Z[i], then the norm N(z) is a sum
of the squares of two integers. So if we can find z such that N(z) is a perfect
square, then we have a Pythagorean triple, right?

Corollary 2.2 from Chapter 2 comes to the rescue: for every Gaussian
integer z, we have

\[ N(z^2) = \bigl(N(z)\bigr)^2. \]

The right-hand side is the square of an integer, and the left-hand side (because
it is a norm) is the sum of two squares of integers. Bingo.
For example, suppose that z = 3 + 2i. Then N(z) = 13 and z² = 5 + 12i. So

\[ \bigl(N(z)\bigr)^2 = 13^2 \quad \text{and} \quad N(z^2) = 5^2 + 12^2. \]


This gives an easily programmable method for generating all the Pythagorean
triples you will ever need. Try it. Table 3.1 gives just a small sample of the
treasures that await:


Table 3.1. (r + si)² and the square of the resulting norm.


         s = 1           s = 2           s = 3           s = 4
r = 2    3 + 4i, 5
r = 3    8 + 6i, 10      5 + 12i, 13
r = 4    15 + 8i, 17     12 + 16i, 20    7 + 24i, 25
r = 5    24 + 10i, 26    21 + 20i, 29    16 + 30i, 34    9 + 40i, 41
r = 6    35 + 12i, 37    32 + 24i, 40    27 + 36i, 45    20 + 48i, 52
r = 7    48 + 14i, 50    45 + 28i, 53    40 + 42i, 58    33 + 56i, 65
r = 8    63 + 16i, 65    60 + 32i, 68    55 + 48i, 73    48 + 64i, 80

We can see the same thing in a less fancy way: if you want integers a and b
such that a² + b² is a perfect square, you might write the sum of those two
squares as

\[ a^2 + b^2 = (a + bi)(a - bi) \]

and try to make each factor on the right-hand side the square of a Gaussian
integer. And it is within the scope of high-school mathematics to show that if
a + bi = (r + si)², then a − bi = (r − si)². You can finish the argument.

About fourteen centuries after Diophantus, Fermat (1607?–1665) proved
that there are no positive integers a, b, c such that a⁴ + b⁴ = c⁴. He was studying
his copy of Diophantus’s Arithmetica, published in 1621, and he wrote in its
margin:


It is impossible for a cube to be written as a sum of two cubes or a fourth 
power to be written as a sum of two fourth powers or, in general, for any 
number that is a power greater than the second to be written as a sum 
of two like powers. I have discovered a truly marvelous demonstration 
of this proposition which this margin is too narrow to contain. 
Fermat never returned to this problem (at least not publicly) except for his
proof of the case n = 4. The statement that if n > 2, there are no positive
integers a, b, c such that aⁿ + bⁿ = cⁿ is called Fermat’s last theorem. The
original text in which Fermat wrote his famous marginal note is lost today. 
Fermat’s son edited an edition of Diophantus, published in 1670, containing 
his father’s annotations, including his famous “last theorem.” It contained 
other unproved assertions as well, most true, some not. By the early 1800s, 
only Fermat’s conjecture about sums of powers remained undecided, whence 
the name “last theorem.” It became a famous problem, resisting the attempts 
of mathematicians of the highest order for 300 years. Most mathematicians 
believe that Fermat did not have a correct proof. The quest for a proof of 
Fermat’s last theorem generated much beautiful mathematics. In particular, it 
led to an understanding of complex numbers, factorization, and polynomials. 

One of the basic strategies for trying to prove Fermat’s last theorem in the
seventeenth, eighteenth, and nineteenth centuries was the method we used
for n = 2: show that aⁿ + bⁿ factors in Z[ζₙ], but this time, show that the
factors cannot combine to produce a perfect nth power in that system. It is
worth looking at the basic idea, because it shows the importance of unique
factorization.

If the equation xⁿ + yⁿ = zⁿ has a solution in Z, then it has a solution for
every prime factor of n, because if n = pq, then

\[ a^n + b^n = c^n \implies (a^q)^p + (b^q)^p = (c^q)^p. \tag{3.4} \]

Because Fermat proved the theorem for n = 4, equation (3.4) implies that it
has no solution for n = 2ʳ for every integer r ≥ 2. So using equation (3.4)
again, it follows that it is enough to show that there are no integer solutions
to xⁿ + yⁿ = zⁿ for odd prime exponents.

Suppose, then, that p is an odd prime number. The goal is to show that
there are no positive integer solutions to

\[ a^p + b^p = c^p, \tag{3.5} \]

and again, the idea is to factor aᵖ + bᵖ in Z[ζₚ] and show that the factorization
cannot contain the pth power of some prime.
Using Exercise 2.11 in Section 2.1, we have

\[ a^p + b^p = (a + b)(a + \zeta b)(a + \zeta^2 b)\cdots(a + \zeta^{p-1} b), \quad \text{where } \zeta = \zeta_p. \]

With excruciatingly technical calculations and arguments, mathematicians
(Lamé may have been the first, around 1847) showed that it was impossible
for at least one prime factor of aᵖ + bᵖ to show up at least p times on the
right-hand side of this equation. It seemed as if the “Fermat conjecture” was
settled.

But there was a basic flaw in these arguments, a flaw that has its roots in
school mathematics. As we said in the introduction to this chapter, elementary-
school students build “factor trees” for whole numbers, and it is more or less
assumed that if two different children build two different trees by starting
from, say, 4 × 3 and 6 × 2, they will end up with the same prime factors. This
unique factorization property is never questioned (and hardly ever comes up)
in school mathematics.

In high school, we also assume the same property for C[x].

And the argument outlined above for aᵖ + bᵖ also assumes that elements
in Z[ζₚ] can (just as in Z) be factored into primes in only one way. For some
values of p, this is true, and for such primes, Z[ζₚ] has the unique
factorization property mentioned above. The arguments that use Gaussian integers
are solid, because Z[i] has unique factorization. The first case in which the
property fails is Z[ζ₂₃] [66, p. 7], a fact established by Ernst Eduard Kummer
while he was researching a different but related question. It was eventually
shown that unique factorization fails in infinitely many cases.

It was natural to think that just as in Z or in the polynomials of high school,
the rings Z[ζₚ] would have unique factorization, as evidenced by the number
of mathematicians in the seventeenth, eighteenth, and nineteenth centuries
who assumed it. How could so many not have known, for example,
that unique factorization failed in Z[ζ₂₃]? It may seem that 23 is not all that
large, but the calculations in Z[ζ₂₃] are hefty, even with computers. Imagine
the stamina required to calculate by hand in this ring (some of Kummer’s
tour-de-force calculations are recounted in [24, Chapter 4]). So the assumption
that Z[ζₚ] is a unique factorization domain was widespread, and once it
occurred to Kummer and others that this might not hold for every p, the proof
that unique factorization fails in Z[ζ₂₃] was, quite simply, very hard (again,
see [24, Chapter 4]).

We shall leave the story here, only to note that Kummer went on to prove
Fermat’s last theorem in the case that Z[ζₚ] has unique factorization. And
using an idea that is already present when children argue that the prime
factorization of 4 × 3 is the same as that of 6 × 2 (there are primes “behind” each
of the composite factors that are recombined in different ways), Kummer also
developed a theory that would restore a kind of unique factorization, proving
the theorem for a much wider class of primes [19]. But a complete proof had
to wait until the mid-1990s, when Andrew Wiles, using sophisticated methods
developed in the twentieth century, was finally able to prove Fermat’s last
theorem in full generality.


Exercises 


3.13 Prove Corollary 3.22. 


3.14 Show that there are no integers x, y, z with 3 ∤ xyz such that x³ + y³ ≡
z³ mod 9.


3.15 Show that there are no integers x, y, z with 5 ∤ xyz such that x⁵ + y⁵ ≡
z⁵ mod 25. This exercise implies Fermat’s last theorem for exponent 5
in the case that 5 ∤ xyz.


3.16 Are there any integers x, y, z with 7 ∤ xyz such that x⁷ + y⁷ ≡ z⁷ mod 49?

3.17 (i) Sketch the graph of x³ + y³ = 1.
(ii) Show that the only rational points on the graph are (1, 0) and (0, 1).


3.18 Take It Further. Let G be the graph of x³ + y³ = 9.
(i) Sketch G.
(ii) Find the equation of the line ℓ tangent to G at (2, 1).
(iii) Find the intersection of ℓ and G.
(iv) Show that there are infinitely many triples of integers (x, y, z) such that


3.4 Formal Dirichlet Series and the Number 
of Representations of an Integer as the 
Sum of Two Squares 


We saw in the previous section that every prime congruent to 1 modulo 4 is
the sum of two squares. In fact, if p ≡ 1 mod 4, then p = ππ̄ for π ∈ Z[i].
Furthermore, since π is irreducible (N(π) = p), it follows that if π = a + bi,
then p = a² + b² is the only representation up to plus or minus signs and
permuting a and b. For π irreducible means that the only factorizations of p
in Z[i] are p = (πu)(π̄u′), where u and u′ are units. But the only units in Z[i]
are 1, −1, i, −i, which leads to the representations

\[ p = a^2 + b^2 = (-a)^2 + (-b)^2 = a^2 + (-b)^2 = (-a)^2 + b^2. \]

Four other trivial changes (interchanging a and b) arise from π̄ and its
associates. We say that p has eight representations as a sum of two squares.

If we consider an integer that is not prime, then the situation is quite dif- 
ferent. For example, 


\[ 65 = 49 + 16 = 1 + 64. \]

The different possibilities come from

\[ 65 = 5 \cdot 13 = (2 + i)(2 - i)(2 + 3i)(2 - 3i). \]

One grouping gives (7 + 4i)(7 − 4i), and another gives (1 + 8i)(1 − 8i), as
illustrated in Figure 3.3.


Figure 3.3. 65 obtained in two different ways as the sum of two squares. 


For a general integer n, the various regroupings into conjugate pairs of 
terms of the complete factorization of n in Z[i] lead to a messy counting argu- 
ment. However, the final result is hardly wanting in mathematical elegance: 
Yes, there are computer algebra systems that will do this for you, but (as Glenn Stevens always says), don’t let the computer have all the fun.

Theorem 3.23. The number of representations of a positive integer n as the
sum of two squares in Z is four times the excess of the number of divisors of
the form 4t + 1 over those of the form 4t − 1.


Let’s consider an example. Take n = 18. Its divisors are 1, 2, 3, 6, 9, 18.
Two of these are congruent to 1 modulo 4 (1 and 9). There is one, namely 3,
that is congruent to −1 modulo 4. Thus the excess is 1, and so there are four
representations. They are (±3)² + (±3)². On the other hand, take n = 12. Its
divisors are 1, 2, 3, 4, 6, 12. The only one congruent to 1 mod 4 is 1, while 3 is
the only divisor congruent to −1 modulo 4. The excess is 0, so there are no
representations of 12 as the sum of two squares, a fact that is quickly checked.
It seems remarkable that the number of divisors of the form 4t − 1 should never
exceed the number of the form 4t + 1. Can you prove this without knowing
that it is the number of representations of n as a sum of two squares?

It is worth spending a half hour to compute more examples and to tabulate 
the results for n between, say, 1 and 50. You could also check, just for fun, 
that 15625 can be written as a sum of two squares in 28 ways, and 815730721 
can be so written in a whopping 36 ways. 

This result can be formulated nicely by introducing the function χ. We
define χ(n) = 0 if n is even. If n = 4k + 1, put χ(n) = 1, and if n = 4k − 1, put
χ(n) = −1. If you think about it, the sum

\[ \sum_{d > 0,\ d \mid n} \chi(d) \]

measures the difference between the number of positive divisors of n of the
form 4k + 1 and those of form 4k − 1, right?

So, letting r(n) be the total number of representations of n as a sum of two
squares, our result can be stated in the following theorem.


Theorem 3.24. The number of representations of a positive integer n as a
sum of two squares is given by

\[ r(n) = 4 \sum_{d \mid n} \chi(d), \]

where the sum is over the positive divisors of n.
There are several ways to prove this. One of the prettiest uses a piece of 
equipment that finds applications all over number theory. Here we go.... 
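Before proving Theorem 3.24, it may help to watch it work. The Python sketch below compares the divisor-sum formula with a brute-force count of all integer pairs (x, y), signs and order included, satisfying x² + y² = n. The function names chi, r_formula, and r_bruteforce are assumptions of this sketch.

```python
from math import isqrt


def chi(n):
    """chi(n) = 0 for even n, 1 if n = 4k + 1, and -1 if n = 4k - 1."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def r_formula(n):
    """4 * (sum of chi(d) over the positive divisors d of n)."""
    total = 0
    for d in range(1, isqrt(n) + 1):
        if n % d == 0:
            total += chi(d)
            if d != n // d:
                total += chi(n // d)
    return 4 * total


def r_bruteforce(n):
    """Count all integer pairs (x, y), with signs and order, such that x^2 + y^2 = n."""
    count = 0
    for x in range(-isqrt(n), isqrt(n) + 1):
        rest = n - x * x
        y = isqrt(rest)
        if y * y == rest:
            count += 1 if y == 0 else 2    # y and -y are distinct unless y = 0
    return count


for n in (12, 18, 65, 15625):
    print(n, r_formula(n), r_bruteforce(n))
```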
3.4.1. Formal Dirichlet Series 


There is a formalism going back to Euler that makes it possible to prove 
Theorem 3.24 in a particularly elegant manner. A formal Dirichlet series is a 
creature of the form 


\[ \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = a(1) + \frac{a(2)}{2^s} + \frac{a(3)}{3^s} + \cdots, \]


where the a(n) are complex numbers. 
The word “formal” is important here: we think of these series as
bookkeeping devices keeping track of combinatorial or numerical data. We don’t
worry about questions of convergence; we think of s simply as an indetermi- 
nate rather than as a variable that can be replaced by a real or complex num- 
ber. This misses many of the wonderful analytic applications of such series, 
but it turns out that their formal algebraic properties are all we need for this 
discussion. 

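Dirichlet series are added and multiplied formally. Addition is done term
by term:

\[ \sum_{n=1}^{\infty} \frac{a(n)}{n^s} + \sum_{n=1}^{\infty} \frac{b(n)}{n^s} = \sum_{n=1}^{\infty} \frac{a(n) + b(n)}{n^s}. \]

Multiplication is also done term by term, but then one gathers up all terms
with the same denominator. For example, if we are looking for c(12)/12ˢ in

\[ \left(\sum_{n=1}^{\infty} \frac{a(n)}{n^s}\right)\left(\sum_{n=1}^{\infty} \frac{b(n)}{n^s}\right) = \sum_{n=1}^{\infty} \frac{c(n)}{n^s}, \]

then a denominator of 12ˢ could come only from the products

\[ \frac{a(1)}{1^s}\cdot\frac{b(12)}{12^s}, \quad \frac{a(2)}{2^s}\cdot\frac{b(6)}{6^s}, \quad \frac{a(3)}{3^s}\cdot\frac{b(4)}{4^s}, \quad \frac{a(4)}{4^s}\cdot\frac{b(3)}{3^s}, \quad \frac{a(6)}{6^s}\cdot\frac{b(2)}{2^s}, \quad \frac{a(12)}{12^s}\cdot\frac{b(1)}{1^s}. \]

In general, the coefficient c(n) above is given by

\[ c(n) = \sum_{d \mid n} a(d)\, b\!\left(\frac{n}{d}\right), \]

where again, the notation $\sum_{d \mid n}$ means that the sum is over the positive divisors of n.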
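Since only finitely many terms contribute to each coefficient, the product of two formal Dirichlet series can be computed exactly up to any cutoff. A minimal Python sketch follows; the list representation (index 0 unused), the cutoff N, and the function name dirichlet_product are assumptions of this sketch.

```python
def dirichlet_product(a, b, N):
    """Coefficients c(1), ..., c(N) of the product of two formal Dirichlet series.

    a and b are lists with a[n], b[n] the coefficients for 1 <= n <= N (index 0 is unused).
    """
    c = [0] * (N + 1)
    for d in range(1, N + 1):
        for m in range(1, N // d + 1):
            c[d * m] += a[d] * b[m]        # the term a(d)/d^s times b(m)/m^s lands on (dm)^s
    return c


N = 70
ones = [0] + [1] * N                        # every coefficient equal to 1
chi = [0] + [0 if n % 2 == 0 else (1 if n % 4 == 1 else -1) for n in range(1, N + 1)]

divisor_counts = dirichlet_product(ones, ones, N)
excess = dirichlet_product(ones, chi, N)
print(divisor_counts[12], excess[12], excess[18], excess[65])
```

Multiplying the all-ones series by itself counts divisors, and multiplying it by the series with coefficients χ(n) produces the divisor excess that appears in Theorem 3.24.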
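The simplest Dirichlet series is the Riemann zeta function:

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}. \]

Then the above expression for c(n) implies that if

\[ \zeta(s) \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = \sum_{n=1}^{\infty} \frac{c(n)}{n^s}, \]

then

\[ c(n) = \sum_{d \mid n} a(d). \]

Let us state this as a theorem.


Theorem 3.25.

\[ \zeta(s) \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = \sum_{n=1}^{\infty} \frac{c(n)}{n^s}, \]

where $c(n) = \sum_{d \mid n} a(d)$.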
Actually, the zeta function usually means the function of a complex variable s (introduced by Riemann) that analytically continues this infinite series. It is the object of much current research. Google, for example, the Riemann hypothesis.

A function is said to be multiplicative if a(mn) = a(m)a(n) whenever gcd(m, n) = 1.

Theorem 3.25 gives us a key corollary that we will need very soon:


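Corollary 3.26. If a(n) is defined by

\[ \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = \zeta(s) \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}, \]

then

\[ a(n) = \sum_{d \mid n} \chi(d). \]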

So, a(n) is the excess of the number of positive divisors of n of the form
4k + 1 over the number of divisors of n of the form 4k − 1. Bingo: this is
exactly the function that is the heart of Theorem 3.24. The idea, then, is to
form the Dirichlet series with coefficients r(n) and show that

\[ \sum_{n=1}^{\infty} \frac{r(n)}{n^s} = 4\,\zeta(s) \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]

The 4 out front is there because we want to include the different ways to
write a² + b² (switching a and b and sign changes). To keep things simple, let
us use the function r₁ instead, where r₁(n) is the number of representations
x² + y² = n, x > 0, y ≥ 0. When we are done, we will simply multiply by 4.
Our (new) goal, then, is to show that

\[ \sum_{n=1}^{\infty} \frac{r_1(n)}{n^s} = \zeta(s) \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]

To do this, we will convert each of the sums to a product. For that, we need
(yet another) new idea: a function a defined on nonnegative integers is
strongly multiplicative if for all nonnegative integers m, n, we have

\[ a(mn) = a(m)a(n). \]

Examples of strongly multiplicative functions include the constant function
that assigns 1 to every number and (check this) our function χ. Can you think
of some others?

When a is strongly multiplicative, the Dirichlet series with coefficients
a(n) has an alternative form that shows its connection with arithmetic.


Theorem 3.27. If a is a strongly multiplicative function, then

\[ \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = \prod_{p} \frac{1}{1 - a(p)/p^s}, \]

where the product is over all prime numbers p.


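Lookout Point 3.6. Wait! What does “product is over all prime numbers”
mean? Here is what we need: Consider the formal product

\[ \prod_{p} \frac{1}{1 - a(p)/p^s}. \]

To write this as a Dirichlet series $\sum_{n=1}^{\infty} b(n)/n^s$, fix n and factor n into prime
powers:

\[ n = p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}. \]

Next, look at the finite product

\[ \prod_{i=1}^{r} \frac{1}{1 - a(p_i)/p_i^s}. \]

Each factor can be expanded as a geometric series:

\[ \frac{1}{1 - a(p_i)/p_i^s} = 1 + \left(\frac{a(p_i)}{p_i^s}\right) + \left(\frac{a(p_i)}{p_i^s}\right)^2 + \left(\frac{a(p_i)}{p_i^s}\right)^3 + \cdots = 1 + \frac{a(p_i)}{p_i^s} + \frac{a(p_i^2)}{p_i^{2s}} + \frac{a(p_i^3)}{p_i^{3s}} + \cdots. \]

Then look at the terms

\[ \frac{a(p_1^{e_1})}{p_1^{e_1 s}}, \quad \frac{a(p_2^{e_2})}{p_2^{e_2 s}}, \quad \ldots, \quad \frac{a(p_r^{e_r})}{p_r^{e_r s}}, \]

one from each factor, and multiply their coefficients together. That’s b(n).


Back at the ranch, we want to show that

\[ \sum_{n=1}^{\infty} \frac{a(n)}{n^s} = \prod_{p} \frac{1}{1 - a(p)/p^s}, \]

where a is our strongly multiplicative function.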


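Proof. As above, each factor on the right-hand side is a geometric series:

\[ \frac{1}{1 - a(p)/p^s} = 1 + \left(\frac{a(p)}{p^s}\right) + \left(\frac{a(p)}{p^s}\right)^2 + \left(\frac{a(p)}{p^s}\right)^3 + \cdots = 1 + \frac{a(p)}{p^s} + \frac{a(p^2)}{p^{2s}} + \frac{a(p^3)}{p^{3s}} + \cdots. \]

Now multiply the expressions for $\frac{1}{1 - a(p)/p^s}$ together (one for each prime).
You get the sum of every possible expression of the form

\[ \frac{a(p_1^{e_1})\, a(p_2^{e_2}) \cdots a(p_r^{e_r})}{p_1^{e_1 s}\, p_2^{e_2 s} \cdots p_r^{e_r s}} = \frac{a(p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r})}{\bigl(p_1^{e_1} p_2^{e_2} \cdots p_r^{e_r}\bigr)^s}. \]

Since every positive integer n can be written in one and only one way as a product of
powers of primes (the fundamental theorem of arithmetic again), this is the
same as the sum

\[ \sum_{n=1}^{\infty} \frac{a(n)}{n^s}. \qquad ∎ \]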
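The bookkeeping in this proof can be imitated by machine. The Python sketch below multiplies the geometric series for each prime p ≤ N as formal Dirichlet series, discarding terms beyond a cutoff N, and compares the resulting coefficients with a(n). The cutoff, the list representation (index 0 unused), and the function names are assumptions of this sketch; primes larger than N cannot affect coefficients up to N, so the truncated product is exact in that range.

```python
from math import isqrt


def chi(n):
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1


def primes_up_to(N):
    return [p for p in range(2, N + 1) if all(p % q for q in range(2, isqrt(p) + 1))]


def dirichlet_product(a, b, N):
    c = [0] * (N + 1)
    for d in range(1, N + 1):
        if a[d] == 0:
            continue
        for m in range(1, N // d + 1):
            c[d * m] += a[d] * b[m]
    return c


def euler_product(a, N):
    """Coefficients up to N of the product over primes p of 1 / (1 - a(p)/p^s)."""
    result = [0] * (N + 1)
    result[1] = 1                           # the series "1"
    for p in primes_up_to(N):
        factor = [0] * (N + 1)              # geometric series 1 + a(p)/p^s + a(p)^2/p^(2s) + ...
        power, coeff = 1, 1
        while power <= N:
            factor[power] = coeff
            power *= p
            coeff *= a(p)
        result = dirichlet_product(result, factor, N)
    return result


N = 60
print(euler_product(chi, N)[1:] == [chi(n) for n in range(1, N + 1)])
```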
Where does Q₁ sit in the complex plane?

This is best understood by calculating a few coefficients by hand.

We shall use two applications of Theorem 3.27:


(i) The constant function a(n) = 1 is strongly multiplicative, so the Riemann
zeta function has a product expansion

\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p} \frac{1}{1 - 1/p^s}. \tag{3.6} \]

(ii) Our favorite function χ is also strongly multiplicative, so

\[ \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p} \frac{1}{1 - \chi(p)/p^s}. \]


At last, we are ready to derive a formula for r₁(n), the number of
representations of n in the form x² + y², where x > 0, y ≥ 0. Consider the formal
Dirichlet series

\[ \sum_{n=1}^{\infty} \frac{r_1(n)}{n^s}. \]

Here’s the clever idea: each term in the sum is itself a sum of fractions
with numerator 1, and the number of such fractions is the number of Gaussian
integers with given norm. For example, r₁(25) = 3, because 25 can be written
as a sum of squares in Q₁ := {x + yi | x, y ∈ Z, x > 0, y ≥ 0} in three ways,

\[ 3^2 + 4^2, \quad 4^2 + 3^2, \quad 5^2 + 0^2, \]

so 3/25ˢ comes from

\[ \frac{1}{N(3 + 4i)^s} + \frac{1}{N(4 + 3i)^s} + \frac{1}{N(5 + 0i)^s}. \]

Using this idea of representing a sum of two squares as a norm from Z[i],
using the multiplicativity of N, and letting Q₁ denote the first quadrant as we
have defined it, we get a product formula for the left-hand side:

\begin{align*}
\sum_{n=1}^{\infty} \frac{r_1(n)}{n^s} &= \sum_{\alpha \in Q_1} \frac{1}{(N(\alpha))^s} \\
&= \prod_{\pi \in Q_1} \sum_{k=0}^{\infty} \frac{1}{\bigl((N(\pi))^k\bigr)^s} && \text{(use the fundamental theorem in } \mathbb{Z}[i]\text{)} \\
&= \prod_{\pi \in Q_1} \frac{1}{1 - 1/N(\pi)^s} && \text{(sum a geometric series).}
\end{align*}

Here the product is over all Gaussian irreducibles in the first quadrant.
Now, for convenience, let us use P as a shorthand for $\sum_{n=1}^{\infty} r_1(n)/n^s$, so
that

\[ P = \prod_{\pi \in Q_1} \frac{1}{1 - 1/N(\pi)^s}. \]

We will now pick P apart, looking at each factor. Here we go....
If π is an irreducible in Z[i], then π divides some prime p in Z (so N(π) is p
or p²). And by the law of decomposition in Z[i] (Theorem 3.21), there are three
kinds of primes p:
(i) If p ≡ 1 mod 4, then p splits into two conjugate prime factors: p = ππ̄
for an irreducible π. This contributes two identical terms to P:

\[ \frac{1}{1 - 1/N(\pi)^s} \quad \text{and} \quad \frac{1}{1 - 1/N(\bar{\pi})^s}, \]

both equal to $\frac{1}{1 - 1/p^s}$, and hence P contains a term

\[ \left(\frac{1}{1 - 1/p^s}\right)^2, \]

one for each prime p in Z, p ≡ 1 mod 4.

(ii) If p ≡ 3 mod 4, then p is inert, so it is irreducible in Z[i] and we can
move p upstairs to Z[i] as itself. Hence π = p and N(p) = p². So P
contains a term

\[ \frac{1}{1 - 1/p^{2s}}, \]

one for each prime p in Z, p ≡ 3 mod 4.


(iii) If p = 2, it ramifies: 2 = −i(1 + i)², and N(1 + i) = 2. Hence P contains exactly
one term for the prime p = 2:

\[ \frac{1}{1 - 1/2^s}. \]
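And now, put it all together into one lovely algebraic calculation. Enjoy:

\begin{align*}
\sum_{n=1}^{\infty} \frac{r_1(n)}{n^s} &= \prod_{\pi \in Q_1} \frac{1}{1 - 1/N(\pi)^s} \\
&= \frac{1}{1 - 1/2^s} \prod_{p \equiv 1 \bmod 4} \left(\frac{1}{1 - 1/p^s}\right)^2 \prod_{p \equiv 3 \bmod 4} \frac{1}{1 - 1/p^{2s}} \\
&= \frac{1}{1 - 1/2^s} \prod_{p \equiv 1 \bmod 4} \left(\frac{1}{1 - 1/p^s}\right)^2 \prod_{p \equiv 3 \bmod 4} \frac{1}{1 - 1/p^s} \prod_{p \equiv 3 \bmod 4} \frac{1}{1 + 1/p^s} \\
&= \left(\frac{1}{1 - 1/2^s} \prod_{p \equiv 1 \bmod 4} \frac{1}{1 - 1/p^s} \prod_{p \equiv 3 \bmod 4} \frac{1}{1 - 1/p^s}\right) \left(\prod_{p \equiv 1 \bmod 4} \frac{1}{1 - 1/p^s} \prod_{p \equiv 3 \bmod 4} \frac{1}{1 + 1/p^s}\right) \\
&= \left(\prod_{p} \frac{1}{1 - 1/p^s}\right) \left(\prod_{p} \frac{1}{1 - \chi(p)/p^s}\right) \\
&= \zeta(s) \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.
\end{align*}

It is a very good idea to give reasons for each step in this calculation.

And finally, invoking Corollary 3.26, we have

\[ r_1(n) = \sum_{d \mid n} \chi(d). \]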


That’s the punchline (applause)—we have proved Theorem 3.24, which gives 
the number of representations of an integer as a sum of two squares. 


We have seen that the classification of the irreducible elements in Z[i] 
using Dirichlet series gives a remarkably simple proof of the representation 
formula of an integer as the sum of two squares. For a proof not using Dirich- 
let series, see, for example, Hardy and Wright [35] or Niven and Zucker- 
man [61]. Both texts are excellent introductions to elementary number theory. 


Exercises 


3.19 If $\alpha$ and $\beta$ are two strongly multiplicative functions, show that the function $\gamma$ defined by

$$\gamma(n) = \sum_{d \mid n} \alpha(d)\,\beta\!\left(\frac{n}{d}\right)$$

is also strongly multiplicative.


3.5 Supplement: Hilbert’s 17th Problem 


At the International Congress of Mathematicians held in Paris in 1900, Pro- 
fessor David Hilbert delivered an address entitled “Mathematical Problems.” 
Here is the opening statement of his address: 


Wer von uns würde nicht gern den Schleier lüften, unter dem die Zukunft
verborgen liegt, um einen Blick zu werfen auf die bevorstehenden
Fortschritte unserer Wissenschaft und in die Geheimnisse ihrer Entwicklung
während der künftigen Jahrhunderte!


And here is a translation: 


Who among us would not gladly lift the veil behind which the future 
lies hidden to cast a glance at the coming advances of our science and 
the secrets of its development in future centuries? 


Hilbert, then still in his thirties and recognized by many as the greatest liv- 
ing mathematician, then proceeded to discuss twenty-three problems that he 
considered the most pressing at that time. The problems range over analysis, 
geometry, topology, number theory, and algebra. The seventeenth problem is 
easy to state. 

Consider the rational numbers Q and the field of rational functions

$$Q(x_1, \ldots, x_n)$$

in n variables. Recall that this means that an element of $Q(x_1, \ldots, x_n)$ looks
like

$$\frac{f(x_1, \ldots, x_n)}{g(x_1, \ldots, x_n)},$$

where f and g are polynomials in $x_1, \ldots, x_n$ with coefficients in Q. Call such
a rational function definite if

$$\frac{f(a_1, \ldots, a_n)}{g(a_1, \ldots, a_n)} \ge 0$$

whenever $a_1, \ldots, a_n$ are n rational numbers such that $g(a_1, \ldots, a_n) \ne 0$.
Hilbert asked whether every definite element in Q(x1,...,x,) can be rep- 


resented as the sum of a finite number of squares of rational functions. This 
seemingly simple question had to wait until 1926 for a solution, when Emil 
Artin answered the question in the affirmative. His solution follows from his 
work with Otto Schreier on formally real fields. Hilbert himself had already 
shown that every definite polynomial in two variables can be represented as 
the sum of four squares. The question of how many squares would be needed 
for n variables remained unsettled. Then in 1966, James Ax conjectured that
$2^n$ squares should do it, and his conjecture was proved by Albrecht Pfister in
1967.


Lookout Point 3.7. One can seek, however, more quantitative information
by asking, for a given field F, the smallest number t(F) such that if $a \in F$
can be represented as a sum of squares, then it can be done with t(F) squares.
Pfister’s theorem implies that $t(R) \le 2^n$ (here t(R) is shorthand for the field $R(x_1, \ldots, x_n)$). It is also known that $t(R) \ge n + 1$,
because J. W. S. Cassels proved in 1964 that $1 + x_1^2 + \cdots + x_n^2$ cannot be written
as a sum of n squares in $R(x_1, \ldots, x_n)$. Hence

$$n + 1 \le t(R) \le 2^n.$$

But nobody knows what the actual value of t(R) is! Cassels, William Ellison,
and Pfister showed in 1971 that in the case of two variables, the answer is 4. In
other words, every rational function that is a sum of squares can be expressed
as the sum of at most four squares, and furthermore, three won’t do for at
least one definite function. Indeed, they exhibited the function

$$f(x, y) = 1 + x^2(x^2 - 3)y^2 + x^2y^4$$

and showed that although f is definite, it is not the sum of three squares of
rational functions in R(x, y). Their proof is very difficult and uses a lot of
fancy algebraic geometry.
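
That the Cassels, Ellison, and Pfister example is definite can be checked in two lines. The following worked computation is a standard argument supplied for illustration (it is not taken from the text); it uses only the inequality between the arithmetic and geometric means of three nonnegative numbers.

```latex
\begin{aligned}
f(x,y) &= 1 + x^2(x^2 - 3)y^2 + x^2y^4 = \bigl(1 + x^4y^2 + x^2y^4\bigr) - 3x^2y^2, \\
\frac{1 + x^4y^2 + x^2y^4}{3} &\ge \sqrt[3]{1 \cdot x^4y^2 \cdot x^2y^4} = x^2y^2
\quad\Longrightarrow\quad f(x,y) \ge 0 .
\end{aligned}
```

Equality holds exactly when $1 = x^4y^2 = x^2y^4$, that is, when $x^2 = y^2 = 1$.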


As you can see, the problem is still very much alive. The solution to Hilbert’s 
original question only led to more problems. Such is the nature of mathemat- 
ics. As Hilbert said, “Moreover, a mathematical problem should be difficult in 
order to entice us, yet not completely inaccessible, lest it mock our efforts.” 
And in another place, Hilbert reminds us, “As long as a branch of science
offers an abundance of problems, so long is it alive; a lack of problems
foreshadows extinction or the cessation of independent development.”

In 1930, Hilbert wrote a paper, “Naturerkennen und Logik,” which he
delivered at the Kongress der Gesellschaft Deutscher Naturforscher und Ärzte,
which ends with these words:


Wir müssen wissen.


Wir werden wissen.


Footnote: Naturerkennen und Logik = logic and the understanding of nature; Kongress der Gesellschaft
Deutscher Naturforscher und Ärzte = conference of the Association of German Naturalists and
Physicians; Wir müssen wissen, Wir werden wissen = we must know; we shall know.



The Fundamental Theorem of Algebra


The field C of complex numbers has the remarkable property that every non- 
constant polynomial with coefficients in that field has a root in that field. An 
arbitrary field F with this property is said to be algebraically closed. 

That C is algebraically closed is not an obvious fact, and there are numer- 
ous proofs from various points of view. The proofs are roughly grouped into 
those that exploit the algebraic aspect of the theorem and those that use 
mainly topology and analysis. 

The fundamental theorem guarantees the existence of roots of polynomial 
equations, but it doesn’t provide methods for finding roots—that is another 
whole can of worms. There are classes of equations, such as quadratic equa- 
tions and cyclotomic equations, for which solution algorithms exist, but there 
are no general methods that apply to all polynomial equations. In this chapter, 
we will be (very) happy just establishing the existence of solutions. 


4.1 Getting Started 


First, note that it is enough to establish the fundamental theorem of algebra for
polynomials with real coefficients. For if f has complex coefficients, then $f\bar{f}$
has real coefficients, where $\bar{f}$ is the polynomial obtained from f by replacing
each coefficient by its complex conjugate (this is Exercise 4.1). Then a root
$\alpha$ of $f(x)\bar{f}(x)$ will satisfy either $f(\alpha) = 0$ or $\bar{f}(\alpha) = 0$. In the latter case,
taking conjugates gives $f(\bar{\alpha}) = 0$.
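
The reduction to real coefficients can be watched in a small computation. The sketch below is illustrative only: it multiplies one sample polynomial f by its coefficientwise conjugate and inspects the imaginary parts of the product. The helper names `poly_mul` and `poly_conj` are hypothetical, and a numerical example is of course no substitute for Exercise 4.1.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_conj(p):
    """Replace each coefficient by its complex conjugate."""
    return [complex(a).conjugate() for a in p]

# sample polynomial f(x) = (2 - i) + 3i x + (1 + i) x^2
f = [2 - 1j, 3j, 1 + 1j]
product = poly_mul(f, poly_conj(f))

for degree, coefficient in enumerate(product):
    print(degree, coefficient)
```

Each printed coefficient should have imaginary part zero, as the paragraph above predicts.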

The theorem then says that the field obtained from R by adjoining a root of
$x^2 + 1 = 0$ is algebraically closed. It seems rather amazing, viewed abstractly,
that one can arrive at an algebraically closed field from a given field by adjoining
a root of a single polynomial. A beautiful theorem of Emil Artin and Otto
Schreier states that if $E \supset F$, where E is any algebraically closed field that
is a finite-dimensional vector space over F, then if $F \ne E$, one can conclude
that E = F(i), where i satisfies $x^2 + 1 = 0$.

It is easy to find fields that are not algebraically closed: 


(i) No subfield of R (Q, for example) is algebraically closed (why?). 


(ii) Consider the simplest field $Z_2$, of two elements. Then $x^2 + x + 1$ has no
root in $Z_2$.


(iii) If $Z_p$ is the field with p elements, where p is prime, then $x^p - x + 1$ has no
root in $Z_p$ by Fermat’s little theorem.


(iv) A formally real field (a field in which -1 is not a sum of squares) will
not be algebraically closed.


(v) In the same way, no finite field can be algebraically closed. For let F be
finite. Then $F^*$ is a finite group with, say, n - 1 elements. By elementary
group theory, $x^{n-1} = 1$ for all $x \ne 0$. Hence $x^n - x + 1$ has no root in F.

Margin note: It is even easier than that. If the elements of F are $a_1, \ldots, a_n$, then $(x - a_1)(x - a_2)\cdots(x - a_n) + 1$ has no root.


(vi) More generally, if F is an ordered field such that $a^2 > 0$ for $a \ne 0$, then F
will not be algebraically closed.
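
Items (iii) and (v) are easy to confirm by brute force for small primes. The following sketch is illustrative only and treats the field with p elements as the integers mod p; the function names are hypothetical. It lists the roots of $x^p - x + 1$ and of the product of all the factors x - a, plus 1, and both lists should come out empty.

```python
def roots_mod_p(poly, p):
    """Return every a in {0, ..., p-1} with poly(a) congruent to 0 mod p."""
    return [a for a in range(p) if poly(a) % p == 0]

def fermat_poly(p):
    """The polynomial x^p - x + 1, evaluated on integers."""
    return lambda x: pow(x, p, p) - x + 1

def product_plus_one(p):
    """The polynomial (x - 0)(x - 1)...(x - (p-1)) + 1, evaluated mod p."""
    def poly(x):
        value = 1
        for a in range(p):
            value = value * (x - a) % p
        return value + 1
    return poly

for p in (2, 3, 5, 7, 11, 13):
    print(p, roots_mod_p(fermat_poly(p), p), roots_mod_p(product_plus_one(p), p))
```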


Don’t get the impression that C is the end of the road. For let t be an
indeterminate and consider the field C(t) of rational functions in t with complex
coefficients. Since $x^2 - t$ has no root in C(t), we see that C(t) is not
algebraically closed. You should write out a proof of this.
We will need a more abstract (and hence more general) description of the
real numbers. The set of real numbers R can be described axiomatically as a
complete Archimedean ordered field. It is constructed from Q by a completion
process: First, one puts the usual absolute value |r| on Q by means of the order
relation in Q, namely, |r| = r if $r \ge 0$ and |r| = -r if r < 0. Using this absolute
value, one defines Cauchy sequences. Two Cauchy sequences $\{a_n\}$ and $\{b_n\}$
are called equivalent if $a_n - b_n$ has limit zero as $n \to \infty$. Then equivalence
classes of Cauchy sequences form a field that contains an isomorphic
(structurally identical) copy of Q and is complete (Cauchy sequences converge to
a limit in the field), ordered, and Archimedean (if a > 0, then $a + a + \cdots$ gets
as big as you like). The order relation plays a very important role in this
construction. The complete Archimedean order on R already gives substantial
information about the roots of polynomials.

Margin note: See Section 3.1 to see what we mean by Archimedean.

Margin note: Yes, there are other absolute values on Q. Read on.

Margin note: A Cauchy sequence $(a_n)$ is one whose terms can be made as close together as you like by taking n large enough.

For example, one of the main theorems in elementary analysis is the
intermediate value theorem, a result first proved by Bernard Bolzano in 1817. It
implies that if a continuous real-valued function f defined on [a, b] (the closed
interval from a to b) satisfies the conditions f(a) < 0 and f(b) > 0, then
there is a real number $\xi$ with $f(\xi) = 0$. In particular, if f is a monic
polynomial in R[x] of odd degree, convince yourself that f(x) < 0 for x very
negative and f(x) > 0 for x very positive. It follows that f(x) has a real root.
Hence to prove the fundamental theorem of algebra, which is the historical
name for the fact that C is algebraically closed, one has “only” to deal with
polynomials of even degree.
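
The intermediate value theorem only asserts that a root exists, but the bisection idea behind it also locates one numerically. The sketch below is an illustration under stated assumptions: the polynomial is monic of odd degree with real coefficients, and the bracketing radius `1 + sum(|c|)` is a standard crude bound chosen here for convenience (it is not from the text). The function names are hypothetical.

```python
def evaluate(coeffs, x):
    """Horner evaluation; coeffs run from the leading coefficient to the constant term."""
    value = 0.0
    for c in coeffs:
        value = value * x + c
    return value

def real_root_odd_degree(coeffs, iterations=200):
    """Bisection for a monic real polynomial of odd degree."""
    assert coeffs[0] == 1 and (len(coeffs) - 1) % 2 == 1
    # For |x| >= bound the leading term dominates, so f(-bound) < 0 < f(bound).
    bound = 1 + sum(abs(c) for c in coeffs[1:])
    low, high = -bound, bound
    for _ in range(iterations):
        middle = (low + high) / 2
        if evaluate(coeffs, middle) < 0:
            low = middle      # keep f(low) < 0
        else:
            high = middle     # keep f(high) >= 0
    return (low + high) / 2

cubic = [1, 0, -2, -5]        # x^3 - 2x - 5
root = real_root_odd_degree(cubic)
print(root, evaluate(cubic, root))
```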


Lookout Point 4.1. Another way of stating the result about polynomials
of odd degree goes like this: there is no field E containing the field of real
numbers R such that E is a finite-dimensional vector space over R of odd
degree greater than 1.

To see this, suppose that E is a finite-dimensional vector space of odd
degree over R. Take an element $\alpha \in E$. Since E is finite-dimensional over R,
we see that for some n, the powers $1, \alpha, \alpha^2, \ldots, \alpha^n$ are R-linearly dependent.
Thus $\alpha$ is the root of a polynomial. Let f be the monic polynomial of minimal
degree m that has $\alpha$ as a root. By Theorem 2.7 in Section 2.2, we see that
$R(\alpha)$ is a field that is a vector space of dimension m over R. Thus we have a
tower:


E
|
R(α)
|
R


Then by Exercise 2.38 in the same section, it follows that m divides the
degree of E over R, which forces m to be odd. But the intermediate value
theorem then shows that f(x) has a root in R, which shows that f(x), an
irreducible polynomial, must be $x - \alpha$. Thus $\alpha \in R$. Since $\alpha$ was arbitrary,
we conclude that $E \subset R$, and so E = R. This observation will be useful to
us later, when we see how the fundamental theorem of Galois theory gives a
very short proof that C is algebraically closed.


Incidentally, there are other ways to put an absolute value on Q. First we
must decide what an “absolute value” is. It is a real-valued function |·| on Q
such that

$|x| \ge 0$, and $|x| = 0$ if and only if $x = 0$,
$|xy| = |x|\,|y|$,
$|x + y| \le |x| + |y|$.


To find other absolute values on Q, fix a prime p. If $n \in Z$, $n \ne 0$, write
$n = p^a m$, $p \nmid m$. Define $|n|_p = p^{-a}$. Thus n is “small” if a high power of p
divides it. If $a/b \in Q$, $b \ne 0$, put

$$\left|\frac{a}{b}\right|_p = \frac{|a|_p}{|b|_p}.$$

Show that this doesn’t depend on the way $a/b$ is written (Euclid!), and
check that $|\alpha + \beta|_p \le |\alpha|_p + |\beta|_p$. This is enough to put the Cauchy machine
into action: you can now form sequences and equivalence classes and finally
arrive at a field, denoted by $Q_p$, that contains Q and is complete; but it is
not Archimedean, because $|n|_p \le 1$ for all integers n! (that is an exclamation
point, not a factorial symbol).
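
The p-adic absolute value is simple to compute exactly. The sketch below is illustrative; the function name `p_adic_abs` is hypothetical, and the convention $|0|_p = 0$ is the usual one (the definition above only covers nonzero arguments). It uses exact rational arithmetic so that values such as $|3/25|_5 = 25$ come out without rounding.

```python
from fractions import Fraction

def p_adic_abs(q, p):
    """Return |q|_p for a rational q, with the usual convention |0|_p = 0."""
    q = Fraction(q)
    if q == 0:
        return Fraction(0)
    exponent = 0
    numerator, denominator = q.numerator, q.denominator
    while numerator % p == 0:      # powers of p dividing the numerator
        numerator //= p
        exponent += 1
    while denominator % p == 0:    # powers of p dividing the denominator
        denominator //= p
        exponent -= 1
    return Fraction(1, p) ** exponent

print(p_adic_abs(50, 5))                 # 50 = 5^2 * 2
print(p_adic_abs(Fraction(3, 25), 5))
print(max(p_adic_abs(n, 5) for n in range(1, 1000)))
```

The last line illustrates the failure of the Archimedean property: no integer has 5-adic absolute value larger than 1.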


Lookout Point 4.2. Thus one can construct infinitely many new fields
$Q_2, Q_3, Q_5, Q_7, Q_{11}, \ldots$, arising from different absolute values on Q. The
number-theoretic mystique requires that one write $Q_\infty$ in place of R. From
this point of view, $Q_\infty$ occupies a very special position, for unlike $Q_\infty$, the
field $Q_p$ for $p \ne \infty$ never has the property that $Q_p(s)$, where s is a symbol
satisfying $s^2 = -1$, is algebraically closed! In fact, one can show that if
$p \equiv 1 \bmod 4$, then -1 is actually a square in $Q_p$. Furthermore, there are
irreducible polynomials of arbitrarily high degree over $Q_p$. For example, $x^n + p$
(with $n \ge 2$) has no root in $Q_p$, in contrast to the case for $Q_\infty$. These observations are not
deep, but they require a little better understanding of $Q_p$ than is offered here.

Thus it is a remarkable fact of nature that one can get from R to an
algebraically closed field with so little effort. In general, one has to be content
with an abstract construction. Indeed, there is a basic result established in
every commutative algebra course that asserts that for every field F, one can
construct a field $\overline{F}$ that contains F and is algebraically closed. Furthermore,
every element in $\overline{F}$ is algebraic over F. This last condition does not follow
from the fact that $\overline{F}$ is algebraically closed. For example, we shall see in
Chapter 6 that $\pi$ is not algebraic over Q. The field of all complex numbers
that are algebraic over Q is algebraically closed. Can you prove this,
assuming that C is algebraically closed? It is a good exercise (and it is Exercise 4.4).


Gauss found a proof that C is algebraically closed in 1797 and published 
it as his PhD thesis in 1799. During his lifetime, Gauss discovered a num- 
ber of proofs of the theorem and published four in all. They appeared in 
1799, 1816 (two proofs), and 1850. In the following, we will give a num- 
ber of proofs. After giving a particularly simple proof that relies on a few 
basic facts from elementary analysis, we examine other proofs that involve 
more background. One is based on an idea of Gauss and uses the theorem on 
symmetric functions. Except for the existence of an abstract field containing 
the roots of a polynomial over a given field, this proof is completely elemen- 
tary. On the other hand, we will show how a very simple proof (Artin, 1926) 
can be derived from the basic result of Galois theory. The field C is a very 
special field, and it is perfectly reasonable to ask whether one can verify that 
it is algebraically closed by applying general algebraic theorems. But alge- 
bra alone will not do. We will discuss this later in this chapter after we work 
through several proofs of the theorem. 


Take It Further 


On the other hand, anyone who has had a basic course in complex analy- 
sis is familiar with proofs using complex integration theory. Polynomials are 
entire functions (holomorphic in the whole complex plane), and if a polyno- 
mial f has no zeros, then it is bounded away from zero, and therefore 1/f 
is a bounded entire function. Hence by Liouville’s theorem, it is constant. 
Another complex-analytic proof that exploits the topological character of the 
mapping f(x) is the following. A basic result in complex analysis is that 
holomorphic functions (analytic, regular) are open. That is, they map open 
sets onto open sets. For polynomials, this turns out to be equivalent to the fact 
that C is algebraically closed. 

In the next section, we will begin a completely rigorous proof of the theo- 
rem. But first, some exercises: 


Exercises 


4.1 Show that if $f \in C[x]$, then $f\bar{f}$ has real coefficients, where $\bar{f}$ is the
polynomial obtained from f by replacing each coefficient by its complex
conjugate.

4.2 Prove the claim made in this section that the equivalence classes of 
Cauchy sequences form a field that contains an isomorphic copy of Q. 




4.3 Referring to the definition of $|\cdot|_p$ in this section, prove a strong triangle
inequality: If $\alpha$ and $\beta$ are integers, then

$$|\alpha + \beta|_p \le \max\{|\alpha|_p, |\beta|_p\}.$$

Also, show that equality holds if $|\alpha|_p \ne |\beta|_p$.


4.4 Take It Further. Assuming that C is algebraically closed, show that the 
field of all complex numbers that are algebraic over Q is algebraically 
closed. 


4.5 Prove Corollary 4.3. 


4.2 Background from Elementary Analysis 


Consider R × R as the Euclidean plane with its usual metric. The distance
function allows one to introduce the notions of boundedness and closedness
for subsets. We need to use the fact that R × R is complete as a metric space.
In other words, if $\xi_1, \xi_2, \ldots$ are in R × R and if $\|\xi_i - \xi_j\| \to 0$ as $i, j \to \infty$ (more
precisely, if given $\epsilon > 0$, there exists an integer N such that $\|\xi_i - \xi_j\| < \epsilon$ for
i, j > N), then there is a point $\xi \in$ R × R such that $\|\xi - \xi_i\| \to 0$. A set $A \subset$ R × R
is said to be closed if A contains all its limit points. In other words, if $\xi_i \to \xi$,
$\xi_i \in A$, then $\xi \in A$. If A is closed and bounded, then we say that A is compact.


Lemma 4.1. Let f be a real-valued continuous mapping defined on a compact
set A in R × R. Then f is bounded on A. That is, there exists N such that
$|f(x)| \le N$ for all $x \in A$.


Proof. Since A is compact, it is bounded, so put a big square S around it. Divide the square
into four congruent squares. If f is unbounded on A, then it is unbounded on
a part of A inside one of the smaller squares. Pick one of those squares and
call it $S_1$. Take $\xi_1 \in S_1 \cap A$ with $|f(\xi_1)| > 2$. Divide $S_1$ into four squares. Since
f is unbounded on $S_1 \cap A$, it is unbounded in one of the new smaller squares,
say $S_2$. Take $\xi_2 \in S_2 \cap A$ with $|f(\xi_2)| > 4$. Continuing in this way, we get a
sequence $\{\xi_i\}$ in which $|f(\xi_i)| > 2^i$ for each i.

Consider $\bigcap_{i=1}^{\infty} S_i = \{\xi\}$, which must be a single point. Since A is closed,
we have $\xi \in A$. Now, $\xi_i \in S_i \cap A$ implies that $\xi_i \to \xi$. Furthermore, since f is
continuous, it follows that $f(\xi_i) \to f(\xi)$. But then for every positive integer
i, we must have

$$2^i < |f(\xi_i)| \le |f(\xi_i) - f(\xi)| + |f(\xi)|.$$

Since $|f(\xi_i) - f(\xi)|$ goes to zero, we have a ridiculous situation. Since
mathematics is not ridiculous, we have proved the lemma. □


Lemma 4.2. Let f be a real-valued continuous function defined on a compact
set A in R × R. Then there exists $\xi \in A$ such that $f(x) \le f(\xi)$ for all
$x \in A$.


Proof. By the preceding lemma, the set of real numbers f(A) is a bounded
set. By a fundamental property of R, there is a least upper bound (“maximum”)
$\tau$ to f(A). Recall that this $\tau$ is characterized by the following properties:

(i) $\tau \ge \alpha$ for $\alpha \in f(A)$.
(ii) Given $\epsilon > 0$, there exists $\beta \in f(A)$ such that $\beta > \tau - \epsilon$.

Choose $\tau_i \in f(A)$ such that $\tau \ge \tau_i > \tau - 1/2^i$. Then $\tau_i = f(\xi_i)$ for some
$\xi_i$, and $\tau_i \to \tau$. Now, $\{\xi_i\}$ is a bounded sequence of points in A, and a
bisection argument shows that $\{\xi_i\}$ has a convergent subsequence. Call this
subsequence $\xi_{n_i}$. Then $\xi_{n_i} \to \mu$, and since A is closed, we see that $\mu \in A$. Since
f is continuous, we conclude that $f(\xi_{n_i}) \to f(\mu)$. But $f(\xi_{n_i}) \to \tau$, and so
$f(\mu) = \tau$. Thus $\tau$, the least upper bound of f(A), is attained by an element
of A. This finishes the proof. □

Margin note: Here $\|z\|$ denotes the distance from z to the origin.


Corollary 4.3. Given f and A as above, there is an element $b \in A$ such that

$$f(x) \ge f(b) \quad \text{for all } x \in A.$$


Proof. Exercise 4.5. | 


This completes the analytic preliminaries, with the exception of one more
comment. The equation $x^n = \alpha$ for $\alpha \in C$ always has a root in C. First of all,
if r > 0, then $\sqrt[n]{r}$ exists as a real number. Then de Moivre gives

$$\sqrt[n]{r}\left(\cos\frac{\theta}{n} + i\sin\frac{\theta}{n}\right)$$

as a root, where $\alpha = r(\cos\theta + i\sin\theta)$. Thus the proof of the fundamental
theorem of algebra that follows also uses the existence of sin x and cos x. Can
you show that $x^n = \alpha$ has a root without trigonometry (and of course, without
already knowing that C is algebraically closed)?
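
The de Moivre recipe translates directly into a few lines of code. The sketch below is illustrative; `nth_root` is a hypothetical helper that uses the standard library functions `cmath.polar` and `cmath.rect` to pass between $\alpha$ and its polar form $r(\cos\theta + i\sin\theta)$.

```python
import cmath

def nth_root(alpha, n):
    """One root of x^n = alpha, built from the polar form of alpha."""
    r, theta = cmath.polar(alpha)          # alpha = r (cos theta + i sin theta)
    return cmath.rect(r ** (1.0 / n), theta / n)

for alpha in (-1, 1j, 3 - 4j, -8):
    for n in (2, 3, 5):
        root = nth_root(alpha, n)
        print(alpha, n, root, abs(root ** n - alpha))
```

The last printed column is the residual $|x^n - \alpha|$, which should be zero up to rounding error.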


4.3 First Proof of the Fundamental Theorem of Algebra: An Analytic Approach


This proof is adapted from the classic calculus text by Edmund Landau [46]. 
The only facts we need from analysis are the theorem that continuous func- 
tions on compact sets attain their maxima and minima and de Moivre’s theo- 
rem. 

The fact that C is algebraically closed will follow immediately from the 
following two lemmas. 


Lemma 4.4. Given a polynomial $f(x) \in C[x]$, there is a complex number $\alpha$
such that $|f(x)| \ge |f(\alpha)|$ for all $x \in C$.


Lemma 4.5. If f is a nonconstant polynomial in C[x], then given $\alpha \in C$ for
which $f(\alpha) \ne 0$, there exists $\beta \in C$ such that $|f(\beta)| < |f(\alpha)|$.


Lemma 4.4 is not immediate. We cannot apply Lemma 4.2 directly, for although
$|f(x)|$ is a continuous real-valued function, the domain C is not compact (it is
closed but not bounded). However, it is the easier of the two lemmas:

Proof of Lemma 4.4. We shall use the following fact: for complex numbers
$\alpha$ and $\beta$, one has $|\alpha + \beta| \ge |\alpha| - |\beta|$. This follows from the triangle inequality
$|a + b| \le |a| + |b|$ by replacing a by $\alpha + \beta$ and b by $-\beta$.

Consider a polynomial f(x), which we may normalize to the form

$$f(x) = x^n + a_{n-1}x^{n-1} + \cdots + a_0.$$

Then by the above, we see that

$$|f(x)| \ge |x|^n - |a_{n-1}x^{n-1} + \cdots + a_0| \ge |x|^n\left(1 - \left|\frac{a_{n-1}}{x} + \frac{a_{n-2}}{x^2} + \cdots + \frac{a_0}{x^n}\right|\right).$$

If |x| is sufficiently large, then

$$\left|\frac{a_{n-1}}{x} + \cdots + \frac{a_0}{x^n}\right| < \frac{1}{2},$$

say. Thus for |x| > R, we have $|f(x)| > |x|^n/2$.

Now choose |x| larger (if necessary) so that $|x|^n/2 > |a_0| = |f(0)|$. With
this adjustment, we now see that outside some big disk ($|x| > R_1$), we have

$$|f(x)| > |f(0)| \quad \text{for } |x| > R_1.$$

Inside the closed disk defined by $|x| \le R_1$ (which is compact!), there exists
$\alpha$ such that by the corollary to Lemma 4.2 applied to the continuous function
$|f(x)|$ on a compact set (the closed disk of radius $R_1$), we have

$$|f(x)| \ge |f(\alpha)| \quad \text{for } |x| \le R_1.$$

Oh, and note that $|f(0)| \ge |f(\alpha)|$ too, because 0 is in the disk! So, even
outside the disk ($|x| > R_1$), we have

$$|f(x)| > |f(0)| \ge |f(\alpha)|.$$

This shows that for every $x \in C$, we have

$$|f(x)| \ge |f(\alpha)|,$$

which finishes the proof of Lemma 4.4. □


The proof of Lemma 4.5 is a little more subtle. We will start with a partic- 
ular form for f(x) and then show how to reduce f to this special form. And 
so... we need (yet) another lemma: 


Lemma 4.6. Suppose $n \ge m$ and f has the following special form:

$$f(x) = 1 - x^m + a_{m+1}x^{m+1} + \cdots + a_n x^n.$$

Then there is a real number $\xi$, $0 < \xi < 1$, such that $|f(\xi)| < 1$.

Margin note: That is, f(0) = 1 and the lowest-degree term besides the constant term has coefficient -1.


Proof. To see this, apply the (generalized) triangle inequality: for every $x \in C$,
we have

$$|f(x)| \le |1 - x^m| + |a_{m+1}x^{m+1}| + \cdots + |a_n x^n|.$$

If we restrict x to 0 < x < 1, then $1 - x^m > 0$ and $|a_{m+1}x^{m+1}| = |a_{m+1}|x^{m+1}$, so
we can make this inequality more explicit:

$$|f(x)| \le 1 - x^m + |a_{m+1}|x^{m+1} + |a_{m+2}|x^{m+2} + \cdots + |a_n|x^n.$$

Factor out $x^m$:

$$|f(x)| \le 1 - x^m\left[1 - \left(|a_{m+1}|x + |a_{m+2}|x^2 + \cdots + |a_n|x^{n-m}\right)\right] = 1 - x^m[1 - B(x)].$$

We can now choose x close enough to zero to make $0 \le B(x) < 1$, and so we
have $0 < 1 - x^m[1 - B(x)] < 1$. Done. □
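
The proof is constructive enough to run. The sketch below is illustrative only: it represents the special form by the exponent m and a list `higher` holding $a_{m+1}, \ldots, a_n$ (both names are hypothetical), and it finds a suitable $\xi$ by halving until $B(\xi) < 1$. The halving strategy is a convenience chosen here, not something prescribed by the text.

```python
def special_form_value(m, higher, x):
    """Evaluate f(x) = 1 - x^m + sum of higher[k] * x^(m+1+k)."""
    total = 1 - x ** m
    for k, a in enumerate(higher):
        total += a * x ** (m + 1 + k)
    return total

def b_value(higher, x):
    """B(x) = |a_{m+1}| x + |a_{m+2}| x^2 + ... as in the proof."""
    return sum(abs(a) * x ** (k + 1) for k, a in enumerate(higher))

def find_xi(higher):
    """Halve xi, starting from 1/2, until B(xi) < 1."""
    xi = 0.5
    while b_value(higher, xi) >= 1:
        xi /= 2
    return xi

m = 2
higher = [5 + 3j, -40, 7j]     # sample coefficients a_3, a_4, a_5
xi = find_xi(higher)
print(xi, abs(special_form_value(m, higher, xi)))
```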


And now ... onto Lemma 4.5. It has been a while, so recall what we want to 
prove: 


If f is a nonconstant polynomial in C[x] and $\alpha$ is a complex number
for which $f(\alpha) \ne 0$, we want to show that there exists $\beta \in C$ such that

$$|f(\beta)| < |f(\alpha)|.$$


Here we go... 
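Proof of Lemma 4.5. Suppose first that $f(0) \ne 0$, and write

$$f(x) = a_0 + a_m x^m + \cdots + a_n x^n,$$

where $a_0 = f(0)$ and $a_m x^m$ is the lowest-degree nonconstant term with $a_m \ne 0$ (there is one, because f is nonconstant). Let

$$g(x) = \frac{f(x)}{a_0} = 1 + \frac{a_m}{a_0}x^m + \cdots + \frac{a_n}{a_0}x^n.$$

Now, to make this look like the special case, we construct

$$g\left(\sqrt[m]{-\frac{a_0}{a_m}}\;x\right) = 1 - x^m + \text{terms of higher degree},$$

where $\sqrt[m]{-a_0/a_m}$ is any one of the roots of $x^m = -a_0/a_m$ in C (de Moivre!). So,
by our special case, there is a real number $\xi$ such that

$$\left|g\left(\sqrt[m]{-\frac{a_0}{a_m}}\;\xi\right)\right| < 1.$$

So there exists $\tau$ such that $|g(\tau)| < 1$, or since $g = f/a_0$,

$$|f(\tau)| < |a_0| = |f(0)|.$$

Almost there. One more clever transformation: Suppose $f(\alpha) \ne 0$. Consider
the polynomial in C[x] defined by $h(x) = f(\alpha + x)$. Then $h(0) = f(\alpha) \ne 0$, so by what we have just shown,
applied to h, there exists $\tau'$ such that

$$|f(\alpha + \tau')| = |h(\tau')| < |h(0)| = |f(\alpha)|.$$

Bingo: $\beta = \alpha + \tau'$ is the number that the lemma promised. □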




This completes the proof of Lemma 4.5 and our first proof of the fundamental
theorem of algebra. Notice that besides using the existence of minima
for continuous functions on compact sets in R × R, we have used de Moivre’s
theorem to find mth roots of $-a_0/a_m$. In later, more algebraic, proofs we still
need to use properties of continuous functions on compact sets, but we will
need to assume only that if $\alpha \in C$, then there exists $\beta \in C$ with $\beta^2 = \alpha$. This
can be shown without de Moivre. Nevertheless, the above proof is extremely
elegant and is just about the simplest we know.


4.4 Background from the Theory of Equations


The proof of the algebraic closure of C presented in Section 4.3 leaned heavily 
on classical analysis and the order relation in R. Furthermore, we used the 
extension of the absolute value to C and the triangle inequality in C. The 
only analytic fact used in the proof developed in this section is that every 
polynomial of odd degree in R[x] has a real root. All the rest of the argument 
is algebraic. 

A very old result from algebra is the theorem on symmetric functions. It 
played an important role in the classical development of the theory of equa- 
tions and Galois theory. We shall develop the result in a little more generality 
than we need. 

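Consider an integral domain D and n indeterminates $x_1, \ldots, x_n$. As usual,
$D[x_1, \ldots, x_n]$ denotes the ring of all polynomials in $x_1, \ldots, x_n$ with
coefficients in D. We are interested in the following special polynomials:

$$
\begin{aligned}
\sigma_1 &= x_1 + x_2 + \cdots + x_n, \\
\sigma_2 &= x_1x_2 + x_1x_3 + \cdots + x_1x_n + x_2x_3 + x_2x_4 + \cdots = \sum_{i<j} x_ix_j, \\
\sigma_3 &= \sum_{i<j<k} x_ix_jx_k, \\
&\;\;\vdots \\
\sigma_n &= x_1x_2\cdots x_n.
\end{aligned}
$$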


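These are called the elementary symmetric functions. For example, consider
$D[x_1, x_2, x_3, x_4]$. Then

$$
\begin{aligned}
\sigma_1 &= x_1 + x_2 + x_3 + x_4 && \text{(the sum)}, \\
\sigma_2 &= x_1x_2 + x_1x_3 + x_1x_4 + x_2x_3 + x_2x_4 + x_3x_4 && \text{(the sum two at a time)}, \\
\sigma_3 &= x_1x_2x_3 + x_1x_2x_4 + x_1x_3x_4 + x_2x_3x_4 && \text{(the sum three at a time)}, \\
\sigma_4 &= x_1x_2x_3x_4 && \text{(the “sum” four at a time)}.
\end{aligned}
$$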


The polynomials $\sigma_1, \ldots, \sigma_n$ all satisfy the important property that each
remains the same when the variables are permuted in any of the n! different
ways (recall that a permutation of $\{1, 2, \ldots, n\} = T$ is a one-to-one mapping
of T onto itself). In fancy language, they are invariant under the action of
the symmetric group $S_n$ on n letters. However, there are other polynomials
invariant under $S_n$ as well. For example, $x_1^2 + x_2^2 + \cdots + x_n^2$ is invariant. Such
polynomials that are invariant under the action of $S_n$ are called symmetric. But

$$(x_1 + \cdots + x_n)^2 = x_1^2 + \cdots + x_n^2 + 2\sum_{i<j} x_ix_j,$$

and so $x_1^2 + \cdots + x_n^2 = \sigma_1^2 - 2\sigma_2$, and we see that $x_1^2 + \cdots + x_n^2$ belongs to
the ring $D[\sigma_1, \sigma_2, \ldots, \sigma_n]$ of all polynomials in $\sigma_1, \sigma_2, \ldots, \sigma_n$. The theorem
on symmetric functions states that this is always the case, namely that every
symmetric polynomial can be written as a polynomial in the elementary
symmetric polynomials.
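
The identity for the sum of squares can be checked on sample integers. The sketch below is an illustration; `elementary_symmetric` is a hypothetical helper that evaluates $\sigma_k$ at given values by summing the products over all k-element subsets.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """Value of sigma_k at the given numbers (sigma_0 is 1 by convention)."""
    return sum(prod(subset) for subset in combinations(values, k))

xs = [2, -3, 5, 7]
sigma_1 = elementary_symmetric(xs, 1)
sigma_2 = elementary_symmetric(xs, 2)
sum_of_squares = sum(x * x for x in xs)

print(sum_of_squares, sigma_1 ** 2 - 2 * sigma_2)
```

Permuting the entries of `xs` changes neither printed number, which is the invariance under $S_n$ described above.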


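Theorem 4.7. Let $f(x_1, \ldots, x_n) \in D[x_1, \ldots, x_n]$, and suppose that

$$f(x_{\sigma(1)}, \ldots, x_{\sigma(n)}) = f(x_1, \ldots, x_n)$$

for every permutation $\sigma$ of the integers 1, ..., n. Then
$f(x_1, \ldots, x_n) \in D[\sigma_1, \sigma_2, \ldots, \sigma_n]$.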


There are a number of proofs of this result. We give a proof that supplies a 
little more information. First of all, we need to calculate the famous Vander- 
monde determinant. 


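Lemma 4.8.

$$
\begin{vmatrix}
1 & x_1 & \cdots & x_1^{n-1} \\
1 & x_2 & \cdots & x_2^{n-1} \\
\vdots & \vdots & & \vdots \\
1 & x_n & \cdots & x_n^{n-1}
\end{vmatrix}
= \prod_{i>j} (x_i - x_j).
$$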

Proof. By replacing the ith row by itself minus the jth row and noticing
that $x_i - x_j$ divides $x_i^k - x_j^k$, we see that $x_i - x_j$ divides the left-hand side.
But the various $x_i - x_j$ for different pairs (i, j), i > j, are relatively prime.
Hence the right-hand side divides the left-hand side (here we used the fact
that $D[x_1, \ldots, x_n]$ is a unique factorization domain). But the left-hand side
has degree n(n - 1)/2, since $1 + 2 + 3 + \cdots + (n - 1) = n(n - 1)/2$, and so
does the right-hand side, since $\binom{n}{2} = n(n - 1)/2$. Therefore, they differ by a
constant. As an exercise, show that the constant is 1.

Another proof, perhaps simpler, is to continue to operate on the rows and
columns. We proceed by induction on n. Check the lemma for n = 1 and 2.
Then if we subtract from each column, beginning with the second, $x_1$ times
the (original) column preceding it on the left, we obtain

$$
\begin{vmatrix}
1 & 0 & 0 & \cdots & 0 \\
1 & x_2 - x_1 & x_2^2 - x_1x_2 & \cdots & x_2^{n-1} - x_1x_2^{n-2} \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_n - x_1 & x_n^2 - x_1x_n & \cdots & x_n^{n-1} - x_1x_n^{n-2}
\end{vmatrix}.
$$

But now we can kill all the ones in the first column except the top one by
subtracting the first row from each of the other rows. Then you see that the
resulting rows have $x_2 - x_1, x_3 - x_1, \ldots, x_n - x_1$ as a common factor, and so
we have the whole determinant equal to

$$
(x_2 - x_1)\cdots(x_n - x_1)
\begin{vmatrix}
1 & 0 & 0 & \cdots & 0 \\
0 & 1 & x_2 & \cdots & x_2^{n-2} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 1 & x_n & \cdots & x_n^{n-2}
\end{vmatrix}
= (x_2 - x_1)\cdots(x_n - x_1) \prod_{i>j\ge 2} (x_i - x_j)
$$

by the inductive hypothesis. But this is simply

$$\prod_{\substack{i>j \\ i,j \le n}} (x_i - x_j),$$

and that finishes the proof. □
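
The Vandermonde formula can be spot-checked with exact arithmetic. The sketch below is illustrative only: `determinant` is a hypothetical helper that performs Gaussian elimination over the rationals (a generic method, not the column operations of the proof), and the result is compared with the product of the differences $x_i - x_j$ for i > j.

```python
from fractions import Fraction

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    rows = [[Fraction(entry) for entry in row] for row in matrix]
    n = len(rows)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det                      # a row swap flips the sign
        det *= rows[col][col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n):
                rows[r][c] -= factor * rows[col][c]
    return det

def vandermonde(xs):
    """Matrix whose ith row is 1, x_i, x_i^2, ..., x_i^(n-1)."""
    return [[x ** k for k in range(len(xs))] for x in xs]

def difference_product(xs):
    """Product of x_i - x_j over all pairs with i > j."""
    result = 1
    for i in range(len(xs)):
        for j in range(i):
            result *= xs[i] - xs[j]
    return result

xs = [2, -1, 5, 3]
print(determinant(vandermonde(xs)), difference_product(xs))
```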
Now consider the integral domain $D[x_1, \ldots, x_n]$, which contains the
integral domain $D[\sigma_1, \ldots, \sigma_n]$ as a subdomain:

$$D[x_1, \ldots, x_n] \supset D[\sigma_1, \ldots, \sigma_n].$$

Build the polynomial in a new indeterminate z given by

$$(z - x_1)(z - x_2)\cdots(z - x_n).$$

This polynomial is simply

$$z^n - \sigma_1 z^{n-1} + \sigma_2 z^{n-2} - \cdots + (-1)^n\sigma_n$$

(verify this). It follows that each $x_i$ satisfies the relation

$$x_i^n - \sigma_1 x_i^{n-1} + \sigma_2 x_i^{n-2} - \cdots + (-1)^n\sigma_n = 0.$$

This shows that $x_i$ is algebraic over the ring $D[\sigma_1, \ldots, \sigma_n]$.
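
The “verify this” can be carried out on sample integers. The sketch below is illustrative; the helper names are hypothetical. It multiplies out the linear factors one at a time and compares the coefficients with the signed values $(-1)^k\sigma_k$.

```python
from itertools import combinations
from math import prod

def elementary_symmetric(values, k):
    """Value of sigma_k at the given numbers (sigma_0 is 1 by convention)."""
    return sum(prod(subset) for subset in combinations(values, k))

def expand_linear_factors(xs):
    """Coefficients of (z - x_1)...(z - x_n), leading coefficient first."""
    coeffs = [1]
    for x in xs:
        shifted = coeffs + [0]                   # current polynomial times z
        scaled = [0] + [x * c for c in coeffs]   # current polynomial times x
        coeffs = [a - b for a, b in zip(shifted, scaled)]
    return coeffs

def evaluate(coeffs, z):
    """Horner evaluation, leading coefficient first."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value

xs = [2, -3, 5, 7]
expanded = expand_linear_factors(xs)
signed_sigmas = [(-1) ** k * elementary_symmetric(xs, k) for k in range(len(xs) + 1)]
print(expanded, signed_sigmas)
```

Evaluating the expanded polynomial at any $x_i$ gives zero, which is the relation displayed above.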

The following lemma shows that the above polynomial is the minimal
polynomial of each $x_i$ as an algebraic quantity over the field of symmetric
functions. We state it only for $x_1$, but the proof is general in principle. Let S
denote the subring of $D[x_1, \ldots, x_n]$ consisting of all symmetric polynomials
in $x_1, \ldots, x_n$.


Lemma 4.9. If

$$a_{n-1}x_1^{n-1} + a_{n-2}x_1^{n-2} + \cdots + a_0 = 0$$

for $a_0, a_1, \ldots, a_{n-1} \in S$, then $a_0 = a_1 = \cdots = a_{n-1} = 0$.

Margin note: In other words, $1, x_1, x_1^2, \ldots, x_1^{n-1}$ are linearly independent over S.


Proof. Since 0 is symmetric, the right-hand side is symmetric. Any
permutation that sends $x_1$ to $x_i$ for i = 1, ..., n leaves $a_0, \ldots, a_{n-1}$ fixed by the
definition of S. Hence for i = 1, ..., n, we have

$$a_{n-1}x_i^{n-1} + a_{n-2}x_i^{n-2} + \cdots + a_0 = 0.$$

Hence $(a_{n-1}, a_{n-2}, \ldots, a_0)$ is a solution vector to the homogeneous system of
linear equations whose coefficient matrix has determinant

$$
\begin{vmatrix}
x_1^{n-1} & x_1^{n-2} & \cdots & 1 \\
x_2^{n-1} & x_2^{n-2} & \cdots & 1 \\
\vdots & \vdots & & \vdots \\
x_n^{n-1} & x_n^{n-2} & \cdots & 1
\end{vmatrix}.
$$

But Vandermonde tells us that this determinant is, up to sign (reverse the order of the columns), $\prod_{i>j}(x_i - x_j)$, which is
nonzero. Hence this matrix is nonsingular, and $(a_{n-1}, a_{n-2}, \ldots, a_0)$ must be
(identically) the zero vector. This finishes the proof. □


Now we can prove the symmetric function theorem (Theorem 4.7), restated 
more simply as follows. 


Theorem 4.10 (Symmetric function theorem). Let D be an integral domain
and S the ring of all symmetric polynomials in $D[x_1, \ldots, x_n]$. Then S is equal
to $D[\sigma_1, \ldots, \sigma_n]$, the ring of polynomials in the first n elementary symmetric
polynomials.


Proof. Check that the right-hand side is contained in the left-hand side. In 
the other direction, let f(x1,...,%n) € S. View f(x1,...,Xn) as a polynomial 
in x2,...,X, with coefficients in D[x;]. We prove the theorem by induction 
on n. In particular, we shall prove that a symmetric polynomial in n variables 
over any integral domain is a polynomial in o},..., 0p. 

The case n = 1 is self-evident: every symmetric polynomial in D[x] is a 
polynomial in the symmetric polynomial x. 

Now assume the result for n — 1. Then f(x,...,%n) is certainly symmet- 


ric in +,.+.,% (we ignore x1). Hence f(41,..%n) = @(404,-+-50,-4)s 
where g(X1, X2,..., Xn) is a polynomial with coefficients in D, and where, of 
course, 


OL =X2+X3+6°+Xn, 


I = . . 
074 = » XiX; > 
i<j 


i,j22 
_ 
Oy, = X2°*Xn- 


However, there are simple relations between o1,...,, and 7%, aap OC. 


4.5 Second Proof of the Fundamental Theorem of Algebra: All Algebra (Almost) 


In fact,

$$\begin{aligned}
\sigma_1 &= x_1 + \sigma'_1,\\
\sigma_2 &= x_1\sigma'_1 + \sigma'_2,\\
&\;\;\vdots\\
\sigma_{n-1} &= x_1\sigma'_{n-2} + \sigma'_{n-1},\\
\sigma_n &= x_1\sigma'_{n-1}.
\end{aligned}$$

Thus each $\sigma'_1, \sigma'_2, \dots, \sigma'_{n-1}$ is seen to be a polynomial in $x_1, \sigma_1, \dots, \sigma_n$ by
successive substitution. Hence by substitution, $g(x_1, \sigma'_1, \dots, \sigma'_{n-1})$ becomes
a polynomial $h(x_1, \sigma_1, \dots, \sigma_n)$. Now recall that $x_1$ satisfies a relation

$$x_1^n - \sigma_1 x_1^{n-1} + \cdots + (-1)^n \sigma_n = 0,$$

and therefore, solving for $x_1^n$ and repeatedly substituting, we may lower the
degree of $h(x_1, \sigma_1, \dots, \sigma_n)$ until it has degree less than n in $x_1$. Write the
resulting polynomial as

$$a_0 x_1^{n-1} + a_1 x_1^{n-2} + \cdots + a_{n-1},$$

where $a_0, \dots, a_{n-1}$ are in $D[\sigma_1, \dots, \sigma_n]$. Recall that this polynomial is still
$f(x_1, \dots, x_n)$, which is in S. Hence

$$a_0 x_1^{n-1} + a_1 x_1^{n-2} + \cdots + (a_{n-1} - f) = 0$$

is a polynomial in $x_1$ with coefficients that are symmetric. According to
Lemma 4.9, every coefficient is identically zero. In particular, $a_{n-1} - f = 0$,
or $f = a_{n-1}$. Thus $f \in D[\sigma_1, \dots, \sigma_n]$. This completes the proof. ∎
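The relations between the elementary symmetric polynomials in all n variables and those in the last n − 1 variables, and the relation satisfied by the first variable, can be checked on sample integers. This is a small sketch in Python, not part of the proof; `elementary_symmetric` and `relations_hold` are illustrative names, and the convention that the 0th elementary symmetric polynomial is 1 is an assumption made to cover the first and last relations.

```python
from itertools import combinations
from math import prod


def elementary_symmetric(values, k):
    """Value of the k-th elementary symmetric polynomial at the given numbers.

    For k = 0 this is 1 (empty product); for k > len(values) it is 0 (empty sum).
    """
    return sum(prod(combo) for combo in combinations(values, k))


def relations_hold(xs):
    """Check sigma_k = x1 * sigma'_(k-1) + sigma'_k and the relation satisfied by x1."""
    x1, rest = xs[0], xs[1:]
    n = len(xs)
    recursion_ok = all(
        elementary_symmetric(xs, k)
        == x1 * elementary_symmetric(rest, k - 1) + elementary_symmetric(rest, k)
        for k in range(1, n + 1)
    )
    # x1^n - sigma_1 x1^(n-1) + ... + (-1)^n sigma_n should vanish.
    root_relation = sum(
        (-1) ** k * elementary_symmetric(xs, k) * x1 ** (n - k) for k in range(n + 1)
    )
    return recursion_ok and root_relation == 0


print(relations_hold([3, -2, 5, 7]))
```

A numerical check like this only illustrates the identities; the proof above works with the polynomials themselves.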


From the point of view of Galois theory we have done the following. The
symmetric group on n letters $S_n$ is a group of automorphisms of the ring
$R = D[x_1, \dots, x_n]$. The subring of symmetric polynomials is simply the fixed
ring of the group $S_n$. Denote that fixed ring by $R_{S_n}$. Then Theorem 4.10 above
states that $R_{S_n} = D[\sigma_1, \sigma_2, \dots, \sigma_n]$. The general procedure of descending
from a ring with a group operating on it to the fixed ring and investigating the
relationship between the group and the rings in between the top one and the
fixed one is an extremely important source of information. It is the basic idea
behind Galois theory.


4.5 Second Proof of the Fundamental 
Theorem of Algebra: All Algebra (Almost) 


With the aid of the symmetric function theorem we will give another proof 
that C is algebraically closed. However, we need one more construction from 
abstract algebra. In this proof, we need to know that there exist roots some- 
where. In other words, we need the fact that if F is a field and f € F[x], 
then one can construct a field E in which f(x) has all its roots. Although 
the construction of such a field is not particularly difficult, we will assume its
existence and include a sketch of the proof in the next subsection. For now, 
we assume the following result. 


Theorem 4.11. Let F be a field and f(x) a polynomial with coefficients in
F. Then there is a field E such that $E \supset F$ and $f(x) = (x - \alpha_1) \cdots (x - \alpha_n)$
with $\alpha_1, \dots, \alpha_n \in E$.


Now we prove our main result. 
Theorem 4.12. The field of complex numbers C is algebraically closed. 


Proof. We will show that every polynomial with real coefficients has a complex
root. As we mentioned earlier, in Section 4.1, this is sufficient. The proof
goes by induction on m, where $\deg f = 2^m n$ and n is odd. If m = 0, the result
follows from the intermediate value theorem.

So, take a polynomial $f \in \mathbb{R}[x]$ of degree $2^m n$, where m > 0. We want to
show that f(x) has a complex root. Consider f(x) as a polynomial in C[x]
and let E be a field containing C that contains all the roots of f.

Let $s = 2^m n$ and $f(x) = \prod_{i=1}^{s}(x - \alpha_i)$, where the $\alpha_i$ are the roots of f. Fix
a positive integer h and form the polynomial whose roots are $\alpha_i + \alpha_j + h\alpha_i\alpha_j$
for $i < j$. Namely,

$$g(x) = \prod_{i<j}\bigl(x - (\alpha_i + \alpha_j + h\alpha_i\alpha_j)\bigr).$$

The coefficients of g are symmetric in $\alpha_1, \dots, \alpha_s$. Hence by the symmetric
function theorem, they are polynomials (with coefficients in R) in the elementary
symmetric functions of $\alpha_1, \dots, \alpha_s$. But these elementary symmetric
functions are the coefficients (up to $\pm 1$) of f(x). Hence $g(x) \in \mathbb{R}[x]$. Now,
$\deg g(x) = \binom{s}{2} = s(s-1)/2 = 2^{m-1}n'$, where $n'$ is odd. Hence the power of 2
dividing $\deg g(x) = \binom{s}{2}$ is less than m. By the induction hypothesis, g(x) has
a root in C.

The idea is to vary the integer h. The induction hypothesis implies that
for each h, at least one of the $\alpha_i + \alpha_j + h\alpha_i\alpha_j$ is in C. So if we let h take
on more than $\binom{s}{2}$ integral values, there must be two of the $\alpha_i + \alpha_j + h\alpha_i\alpha_j$
that are in C with the same i, j. Thus there are distinct integers $h_1$ and $h_2$
such that $\alpha_i + \alpha_j + h_1\alpha_i\alpha_j \in \mathbb{C}$ and $\alpha_i + \alpha_j + h_2\alpha_i\alpha_j \in \mathbb{C}$ for some pair (i, j).
By subtraction, we obtain $(h_1 - h_2)\alpha_i\alpha_j \in \mathbb{C}$, and so $\alpha_i\alpha_j \in \mathbb{C}$. This implies
$\alpha_i + \alpha_j \in \mathbb{C}$. Then

$$(\alpha_i - \alpha_j)^2 = \alpha_i^2 + \alpha_j^2 - 2\alpha_i\alpha_j = (\alpha_i + \alpha_j)^2 - 4\alpha_i\alpha_j \in \mathbb{C},$$

whence $\alpha_i - \alpha_j \in \mathbb{C}$, since C certainly contains all its square roots
(Exercise 4.6). But then $\alpha_i + \alpha_j \in \mathbb{C}$ implies that $\alpha_i \in \mathbb{C}$. This completes the
proof. ∎
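To see the construction of g at work, here is a small numerical sketch in Python. The sample polynomial $f(x) = (x^2 + 1)(x^2 - 2x + 5)$, the choice h = 1, and the helper name `poly_from_roots` are all assumptions made for illustration. The roots of f are known in advance here, which is of course not the situation in the proof.

```python
from itertools import combinations


def poly_from_roots(roots):
    """Coefficients (highest degree first) of the monic polynomial with these roots."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (x - r).
        coeffs = [a - r * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs


# Roots of the sample real polynomial f(x) = (x^2 + 1)(x^2 - 2x + 5); here s = 4 = 2^2.
alphas = [1j, -1j, 1 + 2j, 1 - 2j]
h = 1

f = poly_from_roots(alphas)
pair_values = [a + b + h * a * b for a, b in combinations(alphas, 2)]
g = poly_from_roots(pair_values)

print("f:", f)
print("deg g =", len(g) - 1)
print("largest imaginary part in g:", max(abs(c.imag) for c in g))
```

In this example g has degree 6 = 2 · 3, so the power of 2 dividing the degree has dropped from 2 to 1, and the imaginary parts of the coefficients of g vanish, as the symmetric function theorem predicts.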


Let’s review the ingredients of this proof. We needed the intermediate 
value theorem to get the induction going, the symmetric function theorem 
to build the polynomial g, and a general lemma from field theory that ensures
the existence of an abstract extension field that houses all the roots of f. All
the analysis is tucked away in the statement that polynomials of odd degree 
with real coefficients have at least one real root. 


4.5.1 The Idea behind the Proof of Theorem 4.11: 
More Modular Arithmetic with Polynomials 


We have used the fact that if F is a field and f € F[x], then one can construct 
a field E that contains F in which f(x) has all its roots. The fact that there is 
such a field somewhere up in the sky is due to Kronecker, and it deserves to 
be celebrated as a theorem: 


Theorem 4.13 (Kronecker). Let F be a field and let $f \in F[x]$ be a nonconstant
polynomial. Then there exist a field extension E/F and an element
$u \in E$ such that f(u) = 0.


A complete proof is in [19, Chapter 7]. But the idea is the same one that 
we used in the discussion of modular arithmetic with polynomials in Sec- 
tion 3.1. More precisely, we can assume that f is irreducible (and monic) 
of degree n, and then construct the ring E obtained by reducing elements of 
F [x] modulo f. The construction is a little intricate, but (to oversimplify) the 
steps involved include showing the following: 


(i) The set E obtained by replacing each element of F[x] by its remainder
on division by f is given a ring structure (exactly as we built $\mathbb{Z}_p$ from Z).

(ii) That makes E into a field that contains (an isomorphic copy of) F.

(iii) The image of x in E is a root of f in E (think of the image of x when
polynomials in R[x] are reduced modulo $x^2 + 1$).

(iv) If $g(x) \in F[x]$ and the image of x is a root of g in E, then $f \mid g$ in F[x].

(v) E is a vector space over F, the set $\{1, x, x^2, \dots, x^{n-1}\}$ is a basis, and
$\dim_F E = n$.
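The steps above can be made concrete for the modulus $x^2 + 1$ over Q. What follows is a minimal sketch in Python, assuming polynomials are stored as coefficient lists with the lowest degree first; `poly_mul` and `poly_rem` are illustrative names, not functions from the text.

```python
from fractions import Fraction


def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_rem(a, f):
    """Remainder of a on division by f, padded to length deg(f)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(f):
        lead = a[-1] / f[-1]
        shift = len(a) - len(f)
        for i, fi in enumerate(f):
            a[shift + i] -= lead * fi
        a.pop()  # the leading coefficient is now zero
    return a + [Fraction(0)] * (len(f) - 1 - len(a))


f = [1, 0, 1]  # x^2 + 1

# (3 + 2x)(1 - 4x) reduced modulo x^2 + 1 mirrors (3 + 2i)(1 - 4i) = 11 - 10i.
print(poly_rem(poly_mul([3, 2], [1, -4]), f))

# The image of x is a root of f: substituting x into f gives remainder 0.
print(poly_rem([1, 0, 1], f))
```

Each element of E is represented by a remainder of degree less than 2, so the pair of coefficients plays the role of the real and imaginary parts of a complex number.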
As an example, in Lookout Point 2.5, we did a little arithmetic in $\mathbb{Q}[\zeta_5]$,
and you checked that

$$\frac{1}{\zeta_5^3 - \zeta_5^2 + 2\zeta_5} = -\frac{1}{11}\left(7\zeta_5^3 + 9\zeta_5^2 + 8\zeta_5 + 3\right).$$

As we said there, this value of $(\zeta_5^3 - \zeta_5^2 + 2\zeta_5)^{-1}$ didn’t drop out of the sky.
Here’s the secret to the story:
The minimal polynomial for $\zeta_5$ is

$$\Phi(x) = x^4 + x^3 + x^2 + x + 1.$$

Hence, the Kronecker construction of the field that houses a root of $\Phi$ is
obtained by reducing polynomials in Q[x] modulo $\Phi$. So use Euclid’s algorithm
in Q[x] to compute

$$\gcd\left(x^3 - x^2 + 2x,\ \Phi\right).$$

You will get 11/49, a unit in Q[x] (try it). Using the functions s and t defined
in Lookout Point 3.3, find that

$$\left(-\frac{x^3}{7} - \frac{9x^2}{49} - \frac{8x}{49} - \frac{3}{49}\right)\left(x^3 - x^2 + 2x\right) + \left(\frac{x^2}{7} - \frac{5x}{49} + \frac{11}{49}\right)\Phi(x) = \frac{11}{49}.$$

So

$$\left(-\frac{\zeta_5^3}{7} - \frac{9\zeta_5^2}{49} - \frac{8\zeta_5}{49} - \frac{3}{49}\right)\left(\zeta_5^3 - \zeta_5^2 + 2\zeta_5\right) = \frac{11}{49}.$$

Multiply both sides by 49/11, and the secret is out.

A main message here is that the same ideas that lead from Z to $\mathbb{Z}_p$
can be used as a method that guarantees roots of equations. That’s quite
wonderful.
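The Euclidean computation in this example can be reproduced with exact rational arithmetic. The following Python sketch is one way to do it, under the assumption that polynomials are coefficient lists with the lowest degree first and that remainders are not normalized to be monic; the names `trim`, `sub`, `mul`, `divmod_poly`, and `extended_euclid` are illustrative and are not the functions s and t of Lookout Point 3.3.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def sub(a, b):
    n = max(len(a), len(b))
    a = a + [Fraction(0)] * (n - len(a))
    b = b + [Fraction(0)] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])


def mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def divmod_poly(a, b):
    """Quotient and remainder of a by a nonzero polynomial b over Q."""
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 0)
    r = list(a)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        c = r[-1] / b[-1]
        q[shift] = c
        r = sub(r, mul([Fraction(0)] * shift + [c], b))
    return trim(q), r


def extended_euclid(a, b):
    """Return (g, s, t) with s*a + t*b = g, where g is the last nonzero remainder."""
    r0, r1 = a, b
    s0, s1 = [Fraction(1)], []
    t0, t1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        s0, s1 = s1, sub(s0, mul(q, s1))
        t0, t1 = t1, sub(t0, mul(q, t1))
    return r0, s0, t0


p = [Fraction(c) for c in (0, 2, -1, 1)]  # x^3 - x^2 + 2x
phi = [Fraction(1)] * 5                   # x^4 + x^3 + x^2 + x + 1

g, s, t = extended_euclid(p, phi)
inverse = [c / g[0] for c in s]           # coefficients of the inverse of p modulo phi
print("gcd:", g)
print("inverse:", inverse)
print("check:", divmod_poly(mul(inverse, p), phi)[1])
```

The last line reduces the product of the computed inverse with $x^3 - x^2 + 2x$ modulo $\Phi$; a remainder of 1 confirms the inverse.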


Exercises 


4.6 (i) Without using de Moivre, show that every nonzero complex number 
a + bi has two square roots in C. 


(ii) Use the identity 
$$(\alpha_i - \alpha_j)^2 = \alpha_i^2 + \alpha_j^2 - 2\alpha_i\alpha_j = (\alpha_i + \alpha_j)^2 - 4\alpha_i\alpha_j$$


to derive the quadratic formula without “completing the square.” 


4.7 Let $\Psi(x) = x^2 + 1$. Reduce each polynomial mod $\Psi$:


(i) $3 + 2x$ (ii) $3 + 2x + x^2$
(iii) $3 + 2x + x^2 + x^3$ (iv) $3 + 2x + x^2 + x^3 + x^4$
(v) $x^3$ (vi) $x^4$
(vii) $x^{18}$ (viii) $x^{101}$


4.8 Find the multiplicative inverse of each polynomial modulo $x^2 + 1$:


(i) $3 + 2x$ (ii) $3 + 2x + x^2$
(iii) $3 + 2x + x^2 + x^3$ (iv) $3 + 2x + x^2 + x^3 + x^4$
(v) $x^3$ (vi) $x^4$

(vii) x4 (viii) a+ bx 


4.9 Let $p(x) = x^2 + x + 1$. Reduce each polynomial mod p:


(i) $3 + 2x$ (ii) $3 + 2x + x^2$
(iii) $3 + 2x + x^2 + x^3$ (iv) $3 + 2x + x^2 + x^3 + x^4$
(v) $x^3$ (vi) $x^4$
(vii) $x^{18}$ (viii) $x^{101}$


4.10 Find the multiplicative inverse of each polynomial modulo $x^2 + x + 1$.


(i) $3 + 2x$ (ii) $3 + 2x + x^2$
(iii) $3 + 2x + x^2 + x^3$ (iv) $3 + 2x + x^2 + x^3 + x^4$
(v) $x^3$ (vi) $x^4$
(vii) $x^{101}$ (viii) $a + bx$




4.11 In $\mathbb{Q}[\zeta_5]$, express

$$\frac{1}{3 + \zeta_5 + 2\zeta_5^2}$$

as a linear combination of powers of $\zeta_5$ with coefficients in Q.


4.6 Galois Theory and the Fundamental 
Theorem of Algebra 


The fundamental theorem of algebra follows quickly from the fundamental 
theorem of Galois theory and a few elementary facts about groups. While 
this may seem a little heavy-handed, it is nevertheless instructive to obtain 
special facts such as C being algebraically closed from more general ones. 
The following proof that C is algebraically closed is taken from Artin’s paper 
of 1926 with Otto Schreier [2]. The fundamental theorem of Galois theory 
as stated in modern language leaves little trace of its origins in the theory of 
equations. It goes as follows: 

Let F be a field that contains Q and let $E \supset F$ be a field containing F such
that E is a finite-dimensional vector space over F. If $\alpha \in E$, let f(x) be the
minimal polynomial for $\alpha$. Recall that this is the unique monic irreducible
polynomial that has $\alpha$ as a root. If for each $\alpha \in E$, f(x) has all its roots in
E, i.e., f(x) splits into linear factors, we say that E is a Galois extension
of F. The Galois group of E over F is the group of all automorphisms of E
that leave each element of F fixed. Denote this group by G(E/F) = G. It can
be shown that G is a finite group of order equal to $\dim_F E$. The fundamental
theorem of Galois theory states that there is a one-to-one correspondence
between subgroups of G and subfields of E containing F. The correspondence
is quite explicit. If $E \supset E_1 \supset F$, where $E_1$ is a field, then the corresponding
subgroup of G is $G(E/E_1)$, the automorphisms of E that leave $E_1$ fixed. The
inverse correspondence associates to each subgroup $H \subset G$ the field $E_H$ of
all $\alpha$ in E with $h(\alpha) = \alpha$ for all $h \in H$. The field $E_H$ is called the fixed field
of H. One can see that $G(E/E_H) = H$. Finally, H is a normal subgroup of
G if and only if $E_H$ is a Galois extension of F. (A subgroup $H \subset G$ is a
normal subgroup if $ghg^{-1} \in H$ for each $h \in H$ and $g \in G$; see Example 4 in
Section 2.6.)

In order to obtain the fact that C is algebraically closed, consider a polynomial
$f(x) \in \mathbb{R}[x]$. Then let $E \supset \mathbb{C}$ be a Galois extension of R in which
f(x) has all its roots. That such a field exists follows from Kronecker’s theorem
(Theorem 4.13). We want to show that E = C. If G is the Galois group
of E/R, suppose that G has order $n = 2^r m$ with m odd. By Sylow’s theorem
(Theorem 2.33 in Section 2.6), there exists a subgroup H of G of order $2^r$. If
$E_H$ is the fixed field, we have the diagram in Figure 4.1.

Figure 4.1. The “hypothetical” extension E/C.

Since the fundamental theorem of Galois theory implies that the dimension
of E over $E_H$ is $2^r$, we see that $E_H$ is an extension of R of odd degree. By
Section 4.1 of this chapter, $E_H = \mathbb{R}$. Thus E/R has degree $2^r$, and m = 1.
Hence the Galois group of E/C is of order $2^{r-1}$. If $E \neq \mathbb{C}$, then there would
be a subgroup J in G(E/C) of order $2^{r-2}$, again by Sylow. Then the fixed
field $E_J$ is a quadratic extension, i.e., an extension of degree 2, of C (again by
Galois theory). That is impossible, since all quadratic equations over C have
their roots in C. This finishes the proof (applause).

It is possible, as Emil Artin has shown, to develop Galois theory without the
symmetric function theorem.

Where is the “analytic step” in this proof?

The use of Galois and Sylow enabled us to collapse the “hypothetical” 
extension E to C. Again, the analytic part of the proof is contained in the 
statement that R has no field extensions of odd degree greater than one. 

You should by this time be drooling to read a good account of Galois 
theory. There are quite a few excellent books. Especially recommended: 


(i) Lisl Gaal’s Classical Galois Theory [28]. 


(ii) Pierre Samuel’s beautiful introduction to algebraic number theory,
Théorie Algébrique des Nombres, translated into English by Allan Silberger [76].


(iii) Joseph Rotman’s Galois Theory [69]. 


(iv) Jörg Bewersdorff’s Galois Theory for Beginners: A Historical Perspective,
translated into English by David Kramer [6].


(v) And the best simple introduction to Galois theory is still Artin’s Galois 
Theory [1]. 


4.7 The Topological Point of View 


In this section, we examine the fact that C is algebraically closed from the 
point of view of topology. Thus we view C as the metric space R x R with its 
Euclidean topology. We know, for example, that R × R is connected, complete,
and locally connected. A polynomial f(x) can be viewed as a mapping from
C to C. The fundamental theorem of algebra then states that for every nonconstant
polynomial f(x), zero is in the image f(C). But this is equivalent
to the assertion that f is an onto mapping, i.e., f(C) = C. For suppose we
know that C is algebraically closed. Let $a \in \mathbb{C}$ and consider $f(x) - a = g(x)$.
Then $g(\alpha) = 0$ for some $\alpha \in \mathbb{C}$, and so $f(\alpha) = a$. Conversely, if f(C) = C
then $f(\alpha) = 0$ for some $\alpha$. Hence the fact that f(x) is onto implies that C is
algebraically closed.

We will use some vocabulary and basic results from topology. A good 
introduction is the book by Steenrod and Chinn [4]. 




An important class of functions studied in complex function theory is the
class of functions holomorphic on an open set $U \subset \mathbb{C}$. These functions are
characterized by the property of having, near each $a \in U$, a representation as
a convergent power series, or equivalently, of having a complex derivative at
each point. A basic result in the subject states that a holomorphic function
is an open map. This means that if $V \subset U$, V open, then f(V) is open in C.
Since a polynomial is a convergent power series defined everywhere on C, it
follows that f(C) is open.

And f(C) is also a closed set. For let $f(\xi_i) \to \alpha$ for some $\alpha \in \overline{f(\mathbb{C})}$.
Then $\{\xi_i\}$ is a bounded sequence, since we saw that |f(x)| is arbitrarily
large outside arbitrarily large disks (see Section 4.2). If $\xi_{i_k}$ is a convergent
subsequence, then $f(\xi_{i_k}) \to \alpha$ and $f(\xi_{i_k}) \to f(\mu)$, where $\xi_{i_k} \to \mu$. Hence
$\alpha = f(\mu)$, and since f(C) contains all its limit points, it is closed in C.

We know that C is a connected space, which means that C cannot be written
as $A \cup B$ where A and B are nonempty, disjoint, and open. Or equivalently,
C has no nonempty proper subset that is at the same time open and
closed. This may be derived by combining the facts that R is connected and 
the topological product of two connected spaces is connected. Since f(C) is 
not empty, we must have f(C) = C, and that proves our theorem. 

A slight variation on this argument goes as follows. We learned in Section 4.2
that |f(x)| has an absolute minimum, say $|f(\alpha)|$. Suppose $|f(\alpha)| \neq 0$.
Then since f is open, there is a little disk centered at $f(\alpha)$, not containing 0,
that must be covered by f, i.e., every point in the little disk must be an image
point of f. But there are points closer to 0 than $f(\alpha)$ is. This contradicts the
minimality of $|f(\alpha)|$. Hence $f(\alpha) = 0$.


Take It Further 


A very intuitive proof is possible if we consider the image under f of a closed
curve $\Gamma$ in C, as pictured in Figure 4.2.

Figure 4.2. The image of $\Gamma$ under the function f.

Consider a circle $\Gamma_r$ of radius r in C and the simple polynomial $z^n$ for
$n \geq 1$. What is the image $f(\Gamma_r)$? It is just a circle of radius $r^n$. But since
multiplication of complex numbers adds the angles, we see that the image
circle is traversed n times as z goes around the circle of radius r once. We say
that n is the winding number of $z^n$.

A polynomial is a “short” power series.

Now, $z^n$ is a particularly simple example, but it plays the leading role in
this proof. Consider a polynomial f(z) of degree n and view it as a mapping
of circles of various radii in one plane to curves in another plane as illustrated
in Figure 4.3.

Figure 4.3. The image of a circle under the function f.

If $f(\Gamma_r)$ does not contain the origin, we may speak of the number of times
the image goes around 0 as z traverses the circle $\Gamma_r$ once in a counterclockwise
manner. We count algebraically, so if the image point goes around clockwise,
we subtract 1. Call this number $W_r(f)$. It is defined for every continuous
function f. Although the picture is tempting, it isn’t immediate how to
define such a number rigorously. The idea is to vary the angle of f(z) continuously
and show that the total variation is an integral multiple of $2\pi$. If you
don’t want to do that, then complex integration gives a good definition as

$$\frac{1}{2\pi i}\int_{\Gamma_r} \frac{f'}{f}.$$

However, let us proceed intuitively. If you think about it, you will see that the
“winding” number varies continuously with r. What we shall do is to assume
that f(z) has no zero, so that $W_r(f)$ exists for all r > 0. Then we examine
$W_r(f)$ for large r and see that it is n, and examine it for r small and see that
it is zero. But $W_r(f)$ varies continuously with r, and since R is connected,
it cannot jump from 0 to n. That is a contradiction, and so f(z) must have a
zero after all.
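The behavior of the winding number as r grows can be explored numerically. Below is a rough sketch in Python that accumulates the change in the angle of f(z) around a sampled circle; the function name `winding_number`, the sample count, and the test polynomial with roots 1, 2i, and −3 are assumptions chosen for illustration. The sampling must be fine enough that consecutive image points differ in angle by less than π.

```python
import cmath
import math


def winding_number(f, r, samples=4000):
    """Net number of turns of f(z) about 0 as z runs once counterclockwise on |z| = r."""
    total = 0.0
    prev = f(complex(r, 0))
    for k in range(1, samples + 1):
        z = cmath.rect(r, 2 * math.pi * k / samples)
        cur = f(z)
        total += cmath.phase(cur / prev)  # small signed change in angle
        prev = cur
    return round(total / (2 * math.pi))


def f(z):
    # Sample cubic with zeros of absolute value 1, 2, and 3.
    return (z - 1) * (z - 2j) * (z + 3)


for r in (0.5, 1.5, 2.5, 10):
    print(r, winding_number(f, r))
```

For this sample the count is 0 on a small circle and 3 on a large one, and it changes only when the circle passes through a zero of f, which is the point of the argument above.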

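Now let us put in a few details. First consider f(z) for large r. Since f has
no zero, we must have $a_n \neq 0$. Write f(z) as

$$f(z) = z^n + a_1 z^{n-1} + \cdots + a_n = z^n\left(1 + \frac{a_1}{z} + \cdots + \frac{a_n}{z^n}\right).$$

Since angles add, we see that

$$W_r(f) = W_r(z^n) + W_r\left(1 + \frac{a_1}{z} + \cdots + \frac{a_n}{z^n}\right) = n + W_r\left(1 + \frac{a_1}{z} + \cdots + \frac{a_n}{z^n}\right).$$

But when |z| is very large, we have that

$$\frac{a_1}{z} + \cdots + \frac{a_n}{z^n}$$

is very small, and

$$1 + \frac{a_1}{z} + \cdots + \frac{a_n}{z^n}$$

is close to 1. If

$$\left|\frac{a_1}{z} + \cdots + \frac{a_n}{z^n}\right| < \frac{1}{2},$$

then

$$1 + \frac{a_1}{z} + \cdots + \frac{a_n}{z^n}$$

is confined to a disk of radius 1/2 around 1. It follows that

$$W_r\left(1 + \frac{a_1}{z} + \cdots + \frac{a_n}{z^n}\right)$$

is zero for r large, and hence $W_r(f) = n$ for r large.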

Now we look at $W_r(f)$ for r small. As z goes around a little circle of
radius r, 1/z goes around a big circle of radius 1/r in the opposite direction.
So let us just calculate the winding number of the composition of f with 1/z.
That is,

$$f\left(\frac{1}{z}\right) = \frac{1}{z^n} + \frac{a_1}{z^{n-1}} + \cdots + a_n = \frac{1}{z^n}\left(1 + a_1 z + \cdots + a_n z^n\right).$$

Here the factor $1/z^n$ has winding number $-n$. However, for |z| large the factor
in parentheses is dominated by $a_n z^n$, and $z^n$ still has winding number n, so the
total winding number is 0. Another way, perhaps simpler, of seeing this is
to observe that

$$\left|z^n + a_1 z^{n-1} + \cdots + a_{n-1} z\right| < \frac{|a_n|}{2}$$

for |z| small. Hence $z^n + a_1 z^{n-1} + \cdots + a_{n-1} z + a_n$ stays in a disk of radius
$|a_n|/2$ about $a_n \neq 0$ for $|z| < \varepsilon$. Hence the winding number is zero. And that
does it!

The integer W(f) is called the Brouwer degree of the mapping, and its
general properties are developed in many texts on algebraic topology.
And after working through some of the references, you might consult [86].


Expression (4.1) describes a polynomial with complex coefficients. We
want more: we will show that $\Psi_n$ has integer coefficients.

This concludes our excursion into the fundamental theorem of algebra.
But there’s one more point that needs to be stressed: As we have seen, the 
fundamental theorem is a basic fact about C, and all proofs must involve 
in some way the fact that R is a complete Archimedean field. This so-called 
analytic step showed up in each of our proofs, and it can be shown that it needs 
to be there. Otherwise, we could have modified our proofs to show that Q[i]
is algebraically closed, and that is just not true. The simplest invocation of 
analysis is in the algebraic proof, where it is used to ensure that polynomials 
of odd degree in R[x] have a root in R. 


Exercises 


4.12 Show that if you assume the fundamental theorem of algebra, polynomi- 
als of odd degree have a real root. What is wrong with your proof? 


4.8 Supplement: $x^n - 1$ and Its Factors


The polynomials $x^n - 1$ (n a positive integer) just might be the most important
(read useful) class of polynomials in all of modern algebra. We have used
these polynomials so far to investigate the geometry of regular polygons and
the structure of Pythagorean triples. The roots of $x^n - 1$ (the “nth roots of
unity”) were an important part of the story when we looked at primes in an
arithmetic progression, primitive elements, and Fermat’s last theorem. They
come up in field theory, group theory, analysis, and topology. Much of the
success of $x^n - 1$ comes from the fact that its factors and roots have algebraic,
geometric, and arithmetic interpretations. The irreducible factors of $x^n - 1$, the
cyclotomic polynomials, have fascinated mathematicians ever since Gauss’s
Disquisitiones Arithmeticae [29], and there is a vast literature that digs into
their properties (see, for example, [12], or do a Google search).

We will just scratch the surface here. More precisely, there are three pur- 
poses for this supplement. 


(i) We will fulfill the promise made in Section 2.2: the polynomial $\Psi_n$
defined by

$$\Psi_n(x) = \prod_{(j,n)=1} \left(x - \zeta_n^j\right) \tag{4.1}$$

has integer coefficients and is irreducible in Z[x]. In fact, we shall prove
the following theorem.


Theorem 4.14. The polynomial $\Psi_n(x)$ is the minimal polynomial in Z[x]
for

$$\zeta_n = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$$


(ii) We shall also show that we have the factorization

$$x^n - 1 = \prod_{d \mid n} \Psi_d(x)$$

in Z[x], where the product is over the positive integer divisors of n.


(iii) And we shall investigate the structure of the various $\Psi_n$: their degrees
and coefficients. This will make for some nice connections and (maybe)
a surprise or two.


Let us break up our story into smaller pieces. 


First: The Factorization of $x^n - 1$ in Z[x]


Every root of $x^n - 1$ has an order d that divides n. Conversely, if $d \mid n$, then
a primitive dth root of unity is a root of $x^n - 1$. Since the $x - \zeta^j$ are relatively
prime in C[x], we get the factorization claimed in (ii) above:


Lemma 4.15.

$$x^n - 1 = \prod_{d \mid n} \Psi_d(x). \tag{4.2}$$


Second: $\Psi_n(x) \in \mathbb{Z}[x]$


Equipped with the basics of group theory, we can restate some of the results
from Chapter 2. The solutions to $x^n = 1$ in C form a cyclic group of order
n with generator $\zeta = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}$. The other generators are then $\zeta^j$ for
$1 < j < n$ with j relatively prime to n. These are the primitive nth roots of
unity, that is, the primitive elements in the group. Our polynomial $\Psi_n$ is thus the
monic polynomial whose roots are the primitive nth roots of 1.

The degree of $\Psi_n(x)$ is $\varphi(n)$. If n = p, a prime, then
$\Psi_p(x) = x^{p-1} + x^{p-2} + \cdots + x + 1$ (Theorem 2.11 in Section 2.2), so the coefficients are
certainly integral. But in general, we don’t have such an explicit expression (yet).

We will need Gauss’s lemma. 


Lemma 4.16 (Gauss). Let

$$f(x) = a_n x^n + \cdots + a_0, \quad a_i \in \mathbb{Z},$$
$$g(x) = b_m x^m + \cdots + b_0, \quad b_i \in \mathbb{Z},$$

be polynomials such that $a_n, \dots, a_0$ have no common prime divisor, and similarly
for $b_m, \dots, b_0$. Then $f(x)g(x) = c_{n+m}x^{n+m} + \cdots + c_0$, where $c_{n+m}, \dots, c_0$
have no common prime divisor.


Proof. Suppose, to the contrary, that $f(x)g(x) = p\,h(x)$ for a prime p. Then
reducing modulo p gives $\bar f(x)\bar g(x) = 0$ in $\mathbb{Z}_p[x]$. Since $\mathbb{Z}_p[x]$ has no zero divisors
(the whole point of the proof!), it follows that $\bar f(x) = 0$ or $\bar g(x) = 0$. If,
say, $\bar f(x) = 0$, then p divides each of the coefficients of f, a contradiction. ∎
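Gauss's lemma is easy to try on two specific polynomials. Here is a short Python sketch, assuming integer coefficient lists with the lowest degree first; the names `content` and `multiply` and the two sample polynomials are illustrative choices, not taken from the text.

```python
from functools import reduce
from math import gcd


def content(poly):
    """Greatest common divisor of the coefficients of an integer polynomial."""
    return reduce(gcd, poly)


def multiply(a, b):
    """Product of two integer polynomials (coefficient lists, lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


f = [6, 10, 15]   # 6 + 10x + 15x^2: no prime divides all three coefficients
g = [4, -9, 6]    # 4 - 9x + 6x^2: likewise

product = multiply(f, g)
print(product, content(product))
```

Although every coefficient of f shares a prime with some other coefficient, no single prime divides them all, and the same then holds for the product, as the lemma asserts.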


A corollary of Gauss’s lemma is sometimes more convenient to use. 


Corollary 4.17. A polynomial $f \in \mathbb{Z}[x]$ is irreducible in Z[x] if and only if
f is irreducible in Q[x].

In Section 2.2, we denoted the number of integers j such that (j, n) = 1 by
$\varphi(n)$.

Try working this out with two specific polynomials. It will feel very much like
the argument we used for Eisenstein.

As usual, $\bar f$ is the polynomial obtained by reducing the coefficients of f
modulo p. It lives in $\mathbb{Z}_p[x]$.

Our proof will be the briefest of sketches, but you can fill in the details.

Thanks to Eisenstein, if p is prime, we already know that $\Psi_p$ is irreducible.

Back at the ranch, we have a lemma to prove.

Lemma 4.18. $\Psi_n(x) \in \mathbb{Z}[x]$.


Proof. We use induction on n. When n = 1, we have $\Psi_1 = x - 1$, which is
surely in Z[x]. When n > 1, we write

$$x^n - 1 = \Psi_n(x) \prod_{\substack{d \mid n \\ d < n}} \Psi_d(x) = \Psi_n(x) f(x).$$

By the inductive hypothesis, f(x) is in Z[x]. But $x^n - 1$ is in Z[x] as well,
and therefore, $\Psi_n(x)$, which by definition is a polynomial with complex coefficients,
must in fact have real, indeed rational, coefficients. Clearing denominators,
we can write $\Psi_n(x) = \frac{1}{b}\widetilde{\Psi}_n(x)$, where $b \in \mathbb{Z}$, $b \geq 1$, $\widetilde{\Psi}_n(x) \in \mathbb{Z}[x]$,
and the coefficients of $\widetilde{\Psi}_n(x)$ have no common prime divisor. Since f(x) is
monic by construction, its coefficients also have no common prime divisor.
Thus f(x) and $\widetilde{\Psi}_n(x)$ satisfy the conditions of Lemma 4.16, so we may conclude
that the coefficients of their product have no common prime divisor. But
that means that the coefficients of $b(x^n - 1)$ have no common prime divisor.
Hence b = 1, and we have $\Psi_n(x) \in \mathbb{Z}[x]$, as advertised. ∎


Third: $\Psi_n(x)$ Is Irreducible


We can now finish the proof of the main result for this section, which is 
repeated here: 


Theorem 4.14. The polynomial $\Psi_n(x)$ is the minimal polynomial in Z[x]
for

$$\zeta_n = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$$


Proof. We have established that $\Psi_n(x)$ has integer coefficients, and we know
that $\zeta_n$ is a root. So “all” that is left is to establish the irreducibility of $\Psi_n$. We
will do that now.

Let f(x) be the minimal polynomial for $\zeta_n$; we claim that $\Psi_n = f$. Since
f is irreducible, this will do the trick. To show this, it is enough to show that
if p is a prime not dividing n, then $f(\zeta_n^p) = 0$. That last sentence requires an
argument.


Lemma 4.19. Using the above notation, if $f(\zeta_n^p) = 0$ for every prime p not
dividing n, then $\Psi_n(x)$ is the minimal polynomial in Z[x] for

$$\zeta_n = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$$


Proof. The roots of $\Psi_n$ are the $\zeta_n^k$, where (k, n) = 1. But such a k has a prime
factorization involving only primes that don’t divide n. So every root of $\Psi_n$
is of the form $(\zeta_n^p)^{k'}$, where p does not divide n, and $(k', n) = 1$. But the
hypothesis says that $\zeta_n^p$ is a root of both $\Psi_n$ and f. Repeat the argument with
$\zeta_n$ replaced by $\zeta_n^p$ and k by $k'$ .... This shows that f and $\Psi_n$ have the same roots.
Since both are monic, they are the same. ∎
Let us now prove our claim that $f(\zeta_n^p) = 0$. If $f \neq \Psi_n$, then $f \mid \Psi_n$, because
the minimal polynomial for $\zeta_n$ is a factor of every polynomial in Z[x] that has
$\zeta_n$ as a root. Assuming the worst, let us suppose that $fg = \Psi_n$. Then we know
that

$$x^n - 1 = f(x)g(x)h(x), \tag{4.3}$$

where

$$h(x) = \prod_{\substack{d \mid n \\ 1 \le d < n}} \Psi_d(x).$$

If $f(\zeta_n^p) \neq 0$, then $g(\zeta_n^p) = 0$ (because all the other factors have roots that
have orders less than n, and $\zeta_n^p$ has order n), so $\zeta_n$ is a root of $g(x^p)$. Hence
there exists $j(x) \in \mathbb{Z}[x]$ such that $g(x^p) = f(x)j(x)$.

Now reduce mod p. By Fermat’s little theorem, as a polynomial in $\mathbb{Z}_p[x]$,
we have

$$\bar g(x^p) = (\bar g(x))^p,$$

so in $\mathbb{Z}_p[x]$, we have

$$(\bar g(x))^p = \bar f(x)\,\bar j(x)$$

and $\bar f \mid \bar g^{\,p}$. Hence every irreducible factor $\bar q$ of $\bar f$ in $\mathbb{Z}_p[x]$ also divides $\bar g$.
So from (4.3) above, we have

$$(\bar q(x))^2 \mid x^n - 1.$$

Hence $x^n - 1$, as a polynomial in $\mathbb{Z}_p[x]$, has a multiple factor. So it and
its derivative share a factor (mod p). But the derivative is $nx^{n-1}$, and since
(p, n) = 1, its only irreducible factor in $\mathbb{Z}_p[x]$ is x, which is not a factor of $x^n - 1$.

The punchline is that $f(\zeta_n^p) = 0$. It follows that $f = \Psi_n$, and hence $\Psi_n$ is
irreducible. ∎


Fourth: Calculating $\Psi_n$


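We can rewrite equation (4.2) in a form that allows us to calculate the $\Psi_n$:

$$\Psi_n = \frac{x^n - 1}{\prod_{\substack{d \mid n \\ d < n}} \Psi_d(x)}.$$

Using the fact that $\Psi_1(x) = x - 1$, we have a recursively defined formula
for $\Psi_n$:

$$\Psi_n(x) = \begin{cases}
x - 1 & \text{if } n = 1,\\[4pt]
\dfrac{x^n - 1}{\prod_{\substack{d \mid n \\ d < n}} \Psi_d(x)} & \text{if } n > 1.
\end{cases} \tag{4.4}$$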
d<n 
You should verify that $x^{12} - 1 = \prod_{d \in \{1,2,3,4,6,12\}} \Psi_d(x)$.

The recursive definition can be programmed into a computer algebra system.
Doing this (or just doing it all by hand) gives, for example, the following
table:


n | $\Psi_n(x)$
1 | $x - 1$
2 | $x + 1$
3 | $x^2 + x + 1$
4 | $x^2 + 1$
5 | $x^4 + x^3 + x^2 + x + 1$
6 | $x^2 - x + 1$
7 | $x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$
8 | $x^4 + 1$
9 | $x^6 + x^3 + 1$
10 | $x^4 - x^3 + x^2 - x + 1$
11 | $x^{10} + x^9 + x^8 + x^7 + x^6 + x^5 + x^4 + x^3 + x^2 + x + 1$
12 | $x^4 - x^2 + 1$

Some of these will look familiar to you: they are applications of Eisenstein.
The table contains (especially if you extend it to more entries) a candy
store of patterns. What do you see in it?

Look at one example: the minimal polynomial for $\zeta_{12}$. Using the product
$\prod_{(j,n)=1}(x - \zeta_n^j)$, we want

$$(x - \zeta_{12})(x - \zeta_{12}^5)(x - \zeta_{12}^7)(x - \zeta_{12}^{11}).$$

The table says that this is equal to $x^4 - x^2 + 1$. Why? Well,

$$x^{12} - 1 = (x - 1)(x^2 + x + 1)(x + 1)(x^2 - x + 1)(x^2 + 1)(x^4 - x^2 + 1), \tag{4.5}$$

and we need something of degree four (why?). Hence the last factor is it. Can
you show directly that $x^4 - x^2 + 1$ is irreducible?

One more remark: each factor of (4.5) has, as its roots, powers of a primitive
root $\zeta_{12} = \zeta$. It is a good exercise to check that they break up like this:


Factor | Roots
$x - 1$ | $\zeta^0 = 1$
$x + 1$ | $\zeta^6 = -1$
$x^2 + x + 1$ | $\zeta^4, \zeta^8$
$x^2 + 1$ | $\zeta^3 = i,\ \zeta^9 = -i$
$x^2 - x + 1$ | $\zeta^2, \zeta^{10}$
$x^4 - x^2 + 1$ | $\zeta, \zeta^5, \zeta^7, \zeta^{11}$

Lookout Point 4.3. A delightful consequence of the fact that $\deg \Psi_n = \varphi(n)$
uses Lemma 4.2 to obtain a result from elementary number theory that
is often proved without mentioning cyclotomic polynomials, usually in much
more complicated ways.


Corollary 4.20.

$$n = \sum_{d \mid n} \varphi(d).$$

Convince yourself that this is true and verify it in some numerical cases.
It’s fun.
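A numerical check of Corollary 4.20 takes only a few lines. This Python sketch computes $\varphi$ by brute force straight from its definition, which is an assumption made for simplicity rather than efficiency; `phi` and `divisor_phi_sum` are illustrative names.

```python
from math import gcd


def phi(n):
    """Number of integers j with 1 <= j <= n and gcd(j, n) = 1."""
    return sum(1 for j in range(1, n + 1) if gcd(j, n) == 1)


def divisor_phi_sum(n):
    """Sum of phi(d) over the positive divisors d of n."""
    return sum(phi(d) for d in range(1, n + 1) if n % d == 0)


for n in (1, 12, 30, 105):
    print(n, [phi(d) for d in range(1, n + 1) if n % d == 0], divisor_phi_sum(n))
```

For n = 12 the summands are the degrees of the six factors of $x^{12} - 1$, namely 1, 1, 2, 2, 2, and 4.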


Fifth: The Coefficients of $\Psi_n$


Much has been written about the coefficients of $\Psi_n$. We include one example
here: the sequence of cyclotomic polynomials gives rise to a wonderful
“gotcha” example of misleading conclusions based on examining what seems
to be a large data set.

If you calculate (or look up, but calculate is better) $\Psi_n$ for several dozen
integers n using, say, formula (4.4), you might conclude that the coefficients
are all 0, 1, or −1. Indeed, that is true for the first 104 values of n. But $\Psi_{105}$
contains −2 as the coefficient of $x^7$ and of $x^{41}$.
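The claim about the first 104 values of n and about n = 105 can be tested directly. The sketch below repeats a compact version of recursion (4.4) in Python so that it runs on its own; the function names and the coefficient-list representation (lowest degree first) are illustrative assumptions.

```python
from functools import lru_cache


def divide_by_monic(num, den):
    """Exact quotient of integer polynomials (lowest degree first); den is monic."""
    num = list(num)
    quot = [0] * (len(num) - len(den) + 1)
    for shift in reversed(range(len(quot))):
        quot[shift] = num[shift + len(den) - 1]
        for i, d in enumerate(den):
            num[shift + i] -= quot[shift] * d
    return quot


@lru_cache(maxsize=None)
def cyclotomic(n):
    """Coefficients of the n-th cyclotomic polynomial as a tuple, via recursion (4.4)."""
    poly = [-1] + [0] * (n - 1) + [1]  # x^n - 1
    for d in range(1, n):
        if n % d == 0:
            poly = divide_by_monic(poly, cyclotomic(d))
    return tuple(poly)


small = all(set(cyclotomic(n)) <= {-1, 0, 1} for n in range(1, 105))
psi_105 = cyclotomic(105)
print("coefficients in {-1, 0, 1} for n = 1..104:", small)
print("degree of the 105th:", len(psi_105) - 1)
print("coefficients of x^7 and x^41:", psi_105[7], psi_105[41])
```

The run takes well under a second, which makes the contrast with the computation for n = 1062347 described below all the more striking.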

And it gets worse (or better, depending on your preference). Cleve Moler 
(Google him), founder of MathWorks [52] and the creator of MATLAB, 
wrote a MATLAB program to compute $\Psi_n$ for some large values of n. He
reported: 


For $n = 11 \cdot 13 \cdot 17 \cdot 19 \cdot 23 = 1062347$, the degree of $\Psi_n(x)$ is 760320.
The coefficients range from −1749 to +1694. There are 11804 zero
coefficients. The average coefficient magnitude is 409.9 ....


$\Psi_n(x)$ is computed from the ratio of two polynomials, a numerator of
degree 1105920 and a denominator of degree 345600. It takes about 6
minutes on my laptop to compute the numerator and denominator, and
then about 25 hours to compute their ratio using only deconvolution.


This particular choice for n didn’t come out of the blue. In [80], Jiro Suzuki 
proves the following result: 


Theorem 4.21. If k is odd and if $p_1 < p_2 < \cdots < p_k$ is a “front-loaded”
sequence of primes (the sum of the first two in the sequence is greater than
the last) and if n is the product of all the primes in the sequence, then $\Psi_n(x)$
has $-k + 1$ and $-k + 2$ as coefficients.


Since $105 = 3 \cdot 5 \cdot 7$, and {3, 5, 7} is a front-loaded sequence of length 3, for
$\Psi_{105}(x)$ the theorem predicts the coefficient −2. And it is known that there
exists a front-loaded sequence of length k for every odd $k \geq 3$.


Lookout Point 4.4. If you have access to a fairly powerful computational
environment, take a look at the distribution of the coefficients of $\Psi_n$ for
some large values of n. Cleve Moler did it for $\Psi_{1062347}$ mentioned above.
The distribution is given in Figure 4.4.
Section 5 of Keith Conrad’s expository paper [12] is a good place to see the
lay of the land.


There are many other computable formulas for $\Psi_n$ out there. Take your
pick.
[Figure: histogram titled “Distribution of coefficients”; horizontal axis from
−2000 to 2000, vertical axis in units of $10^4$.]


Figure 4.4. The distribution of coefficients of $\Psi_n$, based on a graphic by
Cleve Moler, used with permission.


The moral of this story is that mathematical objects are real, and they exhibit 
all of the nuance found in physical phenomena. 


Exercises 


4.13 Develop a formula in terms of n for the number of irreducible factors in
Z[x] of $x^n - 1$.




Irrational, Algebraic, and 
Transcendental Numbers 


We have seen in the last chapter that the field of complex numbers admits no
further algebraic extensions. Within the field of complex numbers, however,
there are many numbers that are not algebraic over Q. In fact, the algebraically
closed field of all algebraic numbers in C is a countable set, for you can check
that the algebraic numbers over Q that have a minimal polynomial of degree
n are countable. Letting n vary gives a countable collection of countable sets,
which is therefore countable. Let us agree to call a complex number that
is not in Q an irrational number, so that an irrational number need not be
real. For example, i is irrational. The irrational algebraic numbers are the
algebraic numbers whose minimal polynomials have degree greater than one.
Thus a real root of $x^3 + x + 1$ is a real irrational number. A real number
is rational if and only if its decimal expansion is eventually periodic. Thus
0.101001000100001..., with an ever increasing number of 0’s between the 1’s,
is irrational.

Many interesting numbers occur naturally in higher mathematics. They
may arise as roots of polynomials, as in the case of algebraic numbers, or as
values assumed by the various functions of classical analysis. Such numbers
may be roughly called “classical” numbers (S. Lang, Ltd.).

Google Serge Lang.

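Several of the most important of these functions are

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \cdots,$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots,$$

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots,$$

$$J_0(x) = 1 - \frac{x^2}{2^2} + \frac{x^4}{2^2 4^2} - \frac{x^6}{2^2 4^2 6^2} + \cdots,$$

$$\zeta(x) = 1 + \frac{1}{2^x} + \frac{1}{3^x} + \frac{1}{4^x} + \cdots \quad (x > 1),$$

$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots,$$

$$\Gamma(x) = \int_0^\infty e^{-t} t^{x-1}\,dt.$$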


A complex number is said to be transcendental if it is not algebraic over Q. 
We shall show that $\zeta(2) = \pi^2/6$ in Section 6.1.


Since “almost every” real number is transcendental, and since algebraic
numbers arise in a very special way, most mathematicians would be
shocked were it to turn out that numbers like $\zeta(3)$ and $e + \pi$ are not
transcendental.
Thus $\alpha$ is transcendental if $f(\alpha) \ne 0$ for every nonzero polynomial
f(x) in Q[x]. A general, and unsolved, problem in the theory of transcendental
numbers is to construct an algorithm that will determine, for a given classical
function $\varphi(x)$, whether $\varphi(\alpha)$ is transcendental on input an algebraic
number $\alpha$. Of course, one may compound classical functions and substitute
algebraic numbers, pass to the algebraic closure of the field obtained by such
procedures, and then begin all over again. For example, consider $e^\alpha$, where
$\alpha$ is a
real root of x° + Jo (1)x* ~¢(eY?)x +1=0. Needless to say, mathematics does 
not have a method to handle the question of transcendence of such numbers. 

It isn’t necessary to consider such complicated numbers to give examples
of classical numbers whose irrationality or rationality is still open. For
example, although it is known that $\zeta(2) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots$
is equal to $\pi^2/6$, which is known to be transcendental, the number
$\zeta(3) = 1 + \frac{1}{2^3} + \frac{1}{3^3} + \cdots$ has been
studied without success. No one knows even whether it is irrational, much
less transcendental.¹ The irrationality issue for $e + \pi$ has also not been
settled. This whole area is full of unanswered questions.

But take heart—we will get some important answers in this chapter. 


(i) In Section 5.3, we will show that e is irrational.
(ii) In Section 5.4, we will show that $e^n$ ($n \in \mathbb{Z}$) and $\pi$ are irrational.
(iii) In Section 5.5, we will show that e is transcendental.
(iv) And the grand finale: in Section 5.6, we will show that $\pi$ is
transcendental.


To get going, we need some preliminary ideas, all useful in themselves. Here 
we go... 


5.1 Liouville’s Observation 


It isn’t easy to give explicit examples of classical transcendental numbers. 
There is, however, a simple observation that was made by Joseph Liouville 
(1809-1882) that allows us to write down nonalgebraic numbers. Liouville’s 
result has to do with the approximation of algebraic numbers by rational num- 
bers. Here it is. 


Theorem 5.1. Let $\xi$ be an algebraic number with minimal polynomial of
degree $n \ge 2$. Then if $p/q$ is a rational number such that $|p/q - \xi| < 1$,
then

$$\left|\frac{p}{q} - \xi\right| > \frac{c}{q^n},$$

where c is a constant depending only on $\xi$.


In other words you can’t get too close to $\xi$ with a rational number in the
sense of the above inequality.


Proof. By clearing the denominators in the minimal polynomial, we have
$f(\xi) = 0$, where f(x) is an irreducible polynomial with integer coefficients:

$$f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0.$$

Since f has degree $n > 1$, we see that $f(p/q) \ne 0$ for every rational
number $p/q$ (why?). Hence on substituting, we have

$$\left|f\!\left(\frac{p}{q}\right)\right| = \frac{|a_n p^n + a_{n-1} p^{n-1} q + \cdots + a_0 q^n|}{q^n} \ge \frac{1}{q^n},$$

since the numerator is a nonzero integer.
Since $\xi$ is a root of f(x), we have

$$f(x) = (x - \xi)\,g(x),$$

where g(x) is a polynomial with complex coefficients. Consider the closed
interval $[\xi - 1, \xi + 1]$. Since a continuous function on a closed interval is
bounded (Lemma 4.1), we may choose M such that $|g(x)| < M$ in that
interval. Then for $x \ne \xi$ in that interval, we have

$$|f(x)| = |x - \xi|\,|g(x)| < |x - \xi|\,M.$$

Hence $|x - \xi| > |f(x)|/M$.
For rational $x = p/q$ in our interval, we have

$$\left|\frac{p}{q} - \xi\right| > \frac{|f(p/q)|}{M} \ge \frac{1}{M q^n}.$$

Putting $c = 1/M$, we are through. □


¹ Such was the case in 1972 when I took Ken Ireland’s summer course. But just
six years later, completely out of the blue, the French mathematician Roger
Apéry proved that $\zeta(3)$ is irrational.
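
To see the constant $c = 1/M$ at work, here is a small numerical sketch for the specific case $\xi = \sqrt{2}$, where $f(x) = x^2 - 2$ and $g(x) = x + \sqrt{2}$. The choice of $\xi$, the slack added to M, and the function names are assumptions made for this illustration; floating-point arithmetic is used, so this is a sanity check and not a proof.

```python
import math


def liouville_constant_sqrt2():
    """The constant c = 1/M from the proof, for xi = sqrt(2) and f(x) = x^2 - 2.

    Here g(x) = x + sqrt(2); on [xi - 1, xi + 1] its absolute value is at
    most 2*sqrt(2) + 1, so any M slightly larger than that is admissible.
    """
    M = 2 * math.sqrt(2) + 1.001
    return 1 / M


def worst_ratio(max_q):
    """Minimum of |p/q - sqrt(2)| * q^2 over 1 <= q <= max_q, p nearest to q*sqrt(2)."""
    xi = math.sqrt(2)
    best = float("inf")
    for q in range(1, max_q + 1):
        p = round(q * xi)                 # best numerator for this denominator
        best = min(best, abs(p / q - xi) * q * q)
    return best


c = liouville_constant_sqrt2()
print(c, worst_ratio(2000))
```

The second number printed stays above the first: no fraction $p/q$ with $q \le 2000$ gets closer to $\sqrt{2}$ than $c/q^2$, as the theorem with n = 2 predicts.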


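As a corollary we produce a transcendental number. Consider the
unclassical number

$$\alpha = \frac{1}{10^{1!}} + \frac{1}{10^{2!}} + \frac{1}{10^{3!}} + \cdots = 0.110001000000000000000001\ldots.$$

Let us show that $\alpha$ is transcendental.
First of all, it isn’t rational, because the decimal digits never repeat. And
furthermore,

$$\alpha - \frac{1}{10^{1!}} - \frac{1}{10^{2!}} - \cdots - \frac{1}{10^{m!}} = \frac{1}{10^{(m+1)!}} + \cdots,$$

so

$$\left|\alpha - \frac{p_m}{10^{m!}}\right| < \frac{2}{10^{(m+1)!}}, \tag{5.1}$$

where

$$\frac{p_m}{10^{m!}} = \frac{1}{10^{1!}} + \frac{1}{10^{2!}} + \cdots + \frac{1}{10^{m!}}.$$

Now suppose $\alpha$ were an algebraic number of degree n for some $n \ge 2$.
Then for sufficiently large m, the Liouville inequality would imply

$$\left|\alpha - \frac{p_m}{10^{m!}}\right| > \frac{c}{10^{n \cdot m!}}. \tag{5.2}$$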
The transcendence of e was first proved by Charles Hermite in 1873,
and it marks the beginning of the modern theory of transcendental numbers.
On combining inequality (5.2) with inequality (5.1), we get a sandwich,

$$\frac{2}{10^{(m+1)!}} > \left|\alpha - \frac{p_m}{10^{m!}}\right| > \frac{c}{10^{n \cdot m!}},$$

or

$$10^{(m+1)! - n \cdot m!} < \frac{2}{c},$$

which is false if m is big enough (recall that c and n are fixed!).

Hence we have shown that $\alpha$ is transcendental. There is an added bonus to
this argument. Besides giving a simple explicit example of a transcendental 
number, we can even exhibit uncountably many transcendentals. Although 
we know that the transcendentals are uncountable by counting the algebraic 
numbers and knowing that the reals are uncountable, this argument is unde- 
niably great. 

The same argument also works for

$$\frac{\pm 1}{10^{1!}} + \frac{\pm 1}{10^{2!}} + \frac{\pm 1}{10^{3!}} + \cdots,$$

where the + and − signs are sprinkled arbitrarily. Each such number is
ated by taking a subset of the positive integers and putting minus signs at 
those terms and plus signs everywhere else. Thus the number of such reals 
is the cardinality of the set of all subsets of the positive integers, which is an 
uncountable set. This gives us an uncountable set of nonalgebraic numbers. 
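
The two inequalities in the transcendence argument for $\alpha = \sum 10^{-k!}$ can be checked with exact rational arithmetic. The sketch below is only an illustration: a later partial sum stands in for $\alpha$ itself, and the values n = 3 and c = 1/1000 are hypothetical stand-ins for the unknown degree and Liouville constant.

```python
from fractions import Fraction
from math import factorial


def liouville_partial(m):
    """p_m / 10^(m!): the exact sum of 1/10^(k!) for k = 1..m."""
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, m + 1))


def gap(m, extra_terms=2):
    """Distance from the m-th partial sum to a later partial sum.

    The later sum is a stand-in for alpha; the neglected tail is far
    smaller than the bound it is compared against.
    """
    return liouville_partial(m + extra_terms) - liouville_partial(m)


def contradiction_at(m, n, c):
    """True when the upper bound 2/10^((m+1)!) already lies below c/10^(n*m!)."""
    return Fraction(2, 10 ** factorial(m + 1)) < c / 10 ** (n * factorial(m))


for m in range(1, 5):
    upper = Fraction(2, 10 ** factorial(m + 1))   # right-hand side of (5.1)
    print(m, gap(m) < upper, contradiction_at(m, 3, Fraction(1, 1000)))
```

For these sample values the sandwich first becomes impossible at m = 3; any other fixed n and c only change how large m must be.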


5.2 Gelfond-Schneider and Lindemann-Weierstrass


The problem of exhibiting a “classical” transcendental number requires meth- 
ods more difficult than Liouville’s observation. In Section 5.5, we prove that 
e is transcendental. In this section, in order to give a little more perspective 
as to what is known, we discuss two big theorems—without proof, of course. 

There is no need to stop at e. A general result called the
Lindemann-Weierstrass theorem (stated below) was proved in the early 1880s,
and it implies that for every nonzero algebraic number $\alpha$, the number
$e^\alpha$ is transcendental. Its proof requires techniques more advanced than
those used to prove the transcendence of e.

In order to state Lindemann-Weierstrass, we need a new piece of language.
Notice first the relationship between transcendence and linear independence.
The statement that $\alpha$ is transcendental is equivalent to the statement
that for every n, the set of complex numbers $\{1, \alpha, \alpha^2, \ldots, \alpha^n\}$ is
linearly independent over Q. In this case, we say that the sequence
$1, \alpha, \alpha^2, \alpha^3, \ldots$ is Q-linearly independent. The transcendence of
$e^\alpha$ then amounts to the requirement that

$$\{1, e^{\alpha}, e^{2\alpha}, e^{3\alpha}, \ldots\}$$

be Q-linearly independent. The Lindemann-Weierstrass theorem replaces the
exponents $0, \alpha, 2\alpha, 3\alpha, \ldots$ by an arbitrary sequence of distinct
algebraic numbers and allows Q to be replaced by the field of all algebraic
numbers. In other words, the theorem states the following.


Theorem 5.2 (Lindemann-Weierstrass). Let F be the field of all algebraic
numbers in C. Let $\alpha_1, \alpha_2, \ldots$ be a sequence of distinct algebraic
numbers. Then the sequence $e^{\alpha_1}, e^{\alpha_2}, e^{\alpha_3}, \ldots$ is F-linearly
independent.


As a consequence, we see that $\pi$ is transcendental. For we know by the
above result that $e^\alpha$ is transcendental when $\alpha$ is nonzero and algebraic.
So if $\pi$ were algebraic, then $i\pi$ would be algebraic, and then
$e^{i\pi} = -1$ would be transcendental, which is nonsense. But this is a tough
way to show that $\pi$ is transcendental. In Section 5.6, we will show that
$\pi$ isn’t algebraic using a proof by Ivan Niven (1915-1999) inspired by an
1883 paper by Adolf Hurwitz (the same Hurwitz of the sums of squares
theorems in Chapter 2).


Lookout Point 5.1. The seventh problem in Hilbert’s 1900 Paris address 
is about this very circle of ideas. Here is the text from his address: 


Irrationality and Transcendence of Certain Numbers. 


Hermite’s arithmetical theorems on the exponential function and their exten- 
sion by Lindemann are certain of the admiration of all generations of math- 
ematicians. Thus the task at once presents itself to penetrate further along 
the path here entered, as Hurwitz has already done in two interesting papers, 
“Ueber arithmetische Eigenschaften gewisser transzendenter Funktionen.” I 
should like, therefore, to sketch a class of problems which, in my opinion, 
should be attacked as here next in order. That certain special transcenden- 
tal functions, important in analysis, take algebraic values for certain alge- 
braic arguments, seems to us particularly remarkable and worthy of thorough 
investigation. Indeed, we expect transcendental functions to assume, in gen- 
eral, transcendental values for even algebraic arguments; and, although it is 
well known that there exist integral transcendental functions which even have 
rational values for all algebraic arguments, we shall still consider it highly 
probable that the exponential function $e^{i\pi z}$, for example, which evidently has
algebraic values for all rational arguments z, will on the other hand always 
take transcendental values for irrational algebraic values of the argument z. 
We can also give this statement a geometrical form, as follows: 

If, in an isosceles triangle, the ratio of the base angle to the angle at the
vertex be algebraic but not rational, the ratio between base and side is always 
transcendental. 

In spite of the simplicity of this statement and of its similarity to the prob- 
lems solved by Hermite and Lindemann, we consider the proof of this theo- 
rem very difficult; as also the proof that 

The expression $\alpha^\beta$, for an algebraic base $\alpha$ and an irrational algebraic
exponent $\beta$, e.g., the number $2^{\sqrt{2}}$ or $e^\pi = i^{-2i}$, always represents a
transcendental or at least an irrational number.

It is certain that the solution of these and similar problems must lead us to 
entirely new methods and to a new insight into the nature of special irrational 
and transcendental numbers. 


Thirty-four years later, Alexander Osipovich Gelfond (1906-1968) solved
the $\alpha^\beta$ conjecture, to be followed a year later by an independent proof by
Theodor Schneider (1911-1988). In order to clarify the result, recall that if
$\alpha$ and $\beta$ are complex numbers, the number $\alpha^\beta$, $\alpha \ne 0$, is defined by
$e^{\beta \ln \alpha}$, where ln is the natural logarithm function. But the logarithm
is not single-valued. For example, $e^{i\pi} = -1$ and $e^{3\pi i} = -1$, so $i\pi$ and
$3\pi i$ are both values of $\ln(-1)$. Thus we speak of the values of $\alpha^\beta$.
Hence Gelfond-Schneider states the following.


We allow $\alpha_1 = 0$.

A very clear proof of Theorem 5.2 can be found in Ivan Niven’s delightful
introduction Irrational Numbers [61]. This little book should be in your
mathematical library.

Math. Annalen, vols. 22, 32 (1883, 1888).

Note that if $\alpha$ is algebraic, $\alpha \ne 0$, then, again using
Lindemann-Weierstrass, we see that $e^{i\alpha}$ is transcendental. It follows
that $\cos\alpha$ and $\sin\alpha$ are transcendental, as expected.

See Exercise 5.1.

The Gelfond-Schneider theorem is much deeper than the Lindemann-Weierstrass
theorem. A very clear treatment by Einar Hille can be found in [38].

The assertion: if, in an isosceles triangle, the ratio of the base angle
to the angle at the vertex be algebraic but not rational, the ratio between
base and side is always transcendental.


Theorem 5.3 (Gelfond-Schneider). If $\alpha$ and $\beta$ are algebraic, $\alpha \ne 0, 1$,
and $\beta \notin \mathbb{Q}$, then every value of $\alpha^\beta$ is transcendental.


Thus $3^{\sqrt{5}}$, $\sqrt{2}^{\sqrt{2}}$, and $2^s$, where s is a root of
$x^3 + x + 1 = 0$, are all transcendental. But there is an added surprise. We
can even catch $e^\pi$, which is a transcendental to a transcendental power.
For consider the fun number $i^i$. By definition, this represents the various
numbers $e^{i \ln i}$. One value of $\ln i$ is
$\ln((-1)^{1/2}) = \frac{1}{2}\ln(-1) = \frac{1}{2}\pi i$. Hence
$e^{i \ln i} = e^{-\pi/2}$. Thus $e^{-\pi/2}$ is transcendental. From that you can
quickly conclude that $e^\pi$ is transcendental. This is hard to keep straight.
The fact that $e^{i\pi} = -1$ with Lindemann gives the transcendence of $\pi$,
while the fact that $i^i$ has a value $e^{-\pi/2}$ proves the transcendence
of $e^\pi$ via Gelfond-Schneider. Got it?

Another important naturally occurring real number is the Euler-Mascheroni
constant $\gamma$, defined as the limit of the sequence
$\{1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} - \ln(n)\}$.
It appears in the canonical decomposition of the gamma function $\Gamma(x)$ that
exhibits its poles:

$$\Gamma(x)^{-1} = x e^{\gamma x} \prod_{n=1}^{\infty} \left(1 + \frac{x}{n}\right) e^{-x/n}.$$

This constant “ought” to be transcendental, but in fact, it is unknown whether
$\gamma$ is even irrational.


Exercises 


5.1 How is the assertion made in Lookout Point 5.1 connected to Hilbert’s 
seventh problem? 


5.3 The Irrationality of e 


Two classical constants that you probably encountered before you delved very
deeply into mathematics are e, the base for the natural logarithm, and $\pi$.
Let’s discuss these constants more thoroughly. It turns out that e is the
easier of the two to handle. This is due to the fact that e is nicely expressed
as an infinite series with excellent denominators. More precisely, since

$$e = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots,$$

we can use this to show that e is not a rational number. For suppose $e - 1$
were rational and write $e - 1 = p/q$, where p and q are integers, $q \ne 0$.
Then when n is large (in fact $n \ge q$), the number $n!(e - 1)$ is a positive
integer. An estimate shows this to be impossible. Set
$A = \sum_{i=1}^{n} \frac{1}{i!}$, the sum of the first n terms in
the power series expansion of $e - 1$. We note that $n!A$ is an integer, since
every term in the sum when multiplied by n! is an integer. Then we have

$$0 < n!(e - 1) - n!A = n! \sum_{i=n+1}^{\infty} \frac{1}{i!}
  = n!\left(\frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots\right)
  < \frac{1}{n+1} + \frac{1}{(n+1)^2} + \frac{1}{(n+1)^3} + \cdots = \frac{1}{n}.$$

Thus $n!(e - 1) - n!A$ is a positive integer less than 1, which is impossible.
That was not too bad. For the record...


Theorem 5.4 (e is irrational).

$$e \notin \mathbb{Q}.$$
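
The estimate behind the irrationality of e can be checked exactly with rational arithmetic. The sketch below is a hedged illustration: it truncates the tail of the series after a fixed number of terms (the parameter `terms` is an assumption of this example), so it computes a slight underestimate of $n!(e - 1) - n!A$, which is all that is needed to see the bound $1/n$ in action.

```python
from fractions import Fraction
from math import factorial


def scaled_partial_sum(n):
    """n! * A, where A = 1/1! + 1/2! + ... + 1/n!."""
    return factorial(n) * sum(Fraction(1, factorial(i)) for i in range(1, n + 1))


def scaled_tail(n, terms=40):
    """n! * (1/(n+1)! + ... + 1/(n+terms)!), a truncation of n!(e - 1) - n!A."""
    return sum(Fraction(factorial(n), factorial(k))
               for k in range(n + 1, n + terms + 1))


for n in (1, 2, 5, 10):
    t = scaled_tail(n)
    # n!A is an integer, while the tail lies strictly between 0 and 1/n.
    print(n, scaled_partial_sum(n).denominator == 1, float(t), 0 < t < Fraction(1, n))
```

So $n!(e - 1)$ always sits strictly between the integer $n!A$ and $n!A + 1/n$, which is why it can never be an integer itself.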


5.4 The Irrationality of $\pi$ and $e^c$ ($c \in \mathbb{Z}$)


In the case of $\pi$, an even more familiar constant to many, it is no longer
a simple exercise to show that it is irrational. That it is irrational was first
proved by Johann Heinrich Lambert (1728-1777) in 1761. Here is a very
beautiful proof due to Ivan Niven (1946). The proof is related to an elegant
treatment of the transcendence of e by Adolf Hurwitz in 1883. Hurwitz’s
paper, in turn, was a response to a paper by David Hilbert (1862-1943) in
which the gamma function is used to establish the nonalgebraic character
of e. In all these papers, the basic technique comes from an 1873 proof by
Charles Hermite (1822-1901).

Assume that $\pi$ is rational and write $\pi = a/b$, where a and b are integers.
Then construct (of course) the function f defined by

$$f(x) = \frac{x^n (a - bx)^n}{n!}. \tag{5.3}$$

We consider f(x) on the interval $[0, \pi]$, and we observe that
$f(x) = f(\pi - x)$. When n is large, the value of f(x) is uniformly very small.
Now form the alternating sum of the even derivatives of f(x):

$$F(x) = f(x) - f^{(2)}(x) + f^{(4)}(x) - \cdots + (-1)^n f^{(2n)}(x).$$

Since f(x) has degree 2n, this is a finite sum with n + 1 terms. Then on
differentiating twice,

$$F''(x) = f^{(2)}(x) - f^{(4)}(x) + f^{(6)}(x) - \cdots,$$

and adding, we see that

$$F''(x) + F(x) = f(x).$$
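
The algebra of this construction can be tested on a computer with exact fractions. The sketch below is only an illustration: it pretends that $\pi = 22/7$ (so a = 22, b = 7, hypothetical values chosen for the experiment), takes n = 4, and verifies both the identity $F'' + F = f$ and the integrality of F at 0 and at a/b. The helper names are not from the text.

```python
from fractions import Fraction
from math import factorial


def poly_mul(p, q):
    """Product of two polynomials stored as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return out


def derivative(p):
    """Derivative of a coefficient list; the zero polynomial is [0]."""
    return [k * c for k, c in enumerate(p)][1:] or [Fraction(0)]


def niven_f(a, b, n):
    """Coefficients of x^n (a - b x)^n / n!."""
    poly = [Fraction(1)]
    for _ in range(n):
        poly = poly_mul(poly, [Fraction(0), Fraction(a), Fraction(-b)])  # x(a - bx)
    return [c / factorial(n) for c in poly]


def alternating_even_sum(f):
    """F = f - f'' + f'''' - ..., which stops because f is a polynomial."""
    total = [Fraction(0)] * len(f)
    term, sign = f, 1
    while any(term):
        for k, c in enumerate(term):
            total[k] += sign * c
        term, sign = derivative(derivative(term)), -sign
    return total


def evaluate(p, x):
    return sum(c * x ** k for k, c in enumerate(p))


a, b, n = 22, 7, 4                      # hypothetical: pretend pi = 22/7
f = niven_f(a, b, n)
F = alternating_even_sum(f)
F2 = derivative(derivative(F))
identity_holds = all(F[k] + (F2[k] if k < len(F2) else 0) == f[k]
                     for k in range(len(f)))
print(identity_holds, F[0], evaluate(F, Fraction(a, b)))
```

Both printed values of F are integers and they agree, reflecting the symmetry $f(x) = f(a/b - x)$.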
Be sure to verify each step in this calculation.

We’ll work through Hurwitz’s proof later.

Of course, the “(of course)” is facetious. Functions like this are
created by studying many examples and looking for underlying structure. Also,
we shall need to adjust n to suit our needs as that proof goes on.

Recall the definition of f: $f(x) = \frac{x^n (a - bx)^n}{n!}$.

Exercise 5.2 asks you to fill in the details in the above argument. It’s a
good idea to do this now.

The “inspiration” that led to formula (5.3) was likely the result of a great
deal of experiment and play.
Set $G(x) = F'(x)\sin x - F(x)\cos x$ and notice that its derivative is

$$F''(x)\sin x + F'(x)\cos x - F'(x)\cos x + F(x)\sin x,$$

which is $(F''(x) + F(x))\sin x$, which is $f(x)\sin x$.
By the mean value theorem, we have

$$G(\pi) - G(0) = \pi f(\xi)\sin\xi \tag{5.4}$$

for some $\xi \in (0, \pi)$. What we shall show is that the left-hand side of this
equality is an integer, while the right-hand side is a positive real number that
is less than 1.

Returning to the definition of f(x), we see that since the term of lowest
degree is $a^n x^n/n!$, all the derivatives of order less than n vanish at x = 0,
while all subsequent derivatives are integers at x = 0. But $f(x) = f(\pi - x)$,
so the derivatives $f^{(j)}(\pi)$ are also all integers. Using the fact that
$\sin\pi = 0$, $\cos\pi = -1$, we conclude that the left-hand side of (5.4) is
integral. As for the right-hand side, we notice that $1 \ge \sin x > 0$ on
$(0, \pi)$, and for some constant M, we have $0 < f(x) < M^n/n!$, which
approaches zero as n gets large. Hence for large n, $\pi f(\xi)\sin\xi$ is positive
and less than 1. Since a positive integer cannot be less than 1, this proves
the irrationality of $\pi$. Done. Let’s celebrate this:


Theorem 5.5 ($\pi$ is irrational).

$$\pi \notin \mathbb{Q}.$$


Notice that we have used only the facts that $\cos\pi = -1$, $\sin\pi = 0$, and
$0 < \sin x \le 1$ for $0 < x < \pi$. These facts follow quickly from the
characterization of $\pi$ as twice the first positive zero of cos x. Furthermore,
the only results from calculus used were the mean value theorem and the
derivatives of sin x and cos x. Hence the irrationality of $\pi$ can be
established quite early in the undergraduate program or even at the
high-school calculus level. In view of the dominant role played by $\pi$ in
geometry and analysis, proving this result in beginning calculus seems like a
good idea.


Lookout Point 5.2. The arguments for the irrationality and ultimately the
transcendence of our classical constants will increase in complexity over the
rest of this chapter, but the methods underneath them all will be quite similar
to what we just did for the irrationality of $\pi$: assume that the numbers are
rational (or, later, algebraic) and derive a contradiction by cooking up
expressions (and this is the part that requires work, insight, and inspiration)
that take on impossible values if our candidates are rational (or algebraic).
For example, in the case of $\pi$ above, the expression turned out to be a
positive integer less than 1 at the hypothetical rational value, and that is a
contradiction.

Oh, and another device that will be useful in several places is a simple
polynomial identity that should be (and may be) part of every undergraduate
experience: Suppose that f(x) is a polynomial with coefficients in a field and
F(x) is the (finite, really) sum of all the derivatives of f:

$$F(x) = \sum_{j=0}^{\infty} f^{(j)}(x).$$

Then

$$F(x) - F'(x) = f(x).$$
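
The identity is a telescoping sum. As a short worked derivation (the symbol d for the degree of f is introduced here only for bookkeeping), write the finite sum out and differentiate term by term:

```latex
\begin{align*}
F(x) - F'(x)
  &= \sum_{j=0}^{d} f^{(j)}(x) \;-\; \sum_{j=0}^{d} f^{(j+1)}(x) \\
  &= f(x) - f^{(d+1)}(x) \\
  &= f(x),
\end{align*}
```

since every derivative $f^{(j)}$ with $1 \le j \le d$ appears once in each sum, and $f^{(d+1)} = 0$ for a polynomial of degree d.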


5.4.1 Next up: $e^c$


Using a very similar technique to the one used in establishing the irrationality
of $\pi$, one can show in fact that $e^c$ is irrational for every nonzero integer
c. For the proof, we take our basic function to be

$$f(x) = \frac{x^n (1 - x)^n}{n!}.$$

Form the alternating derived series with powers of c sprinkled about, this time
using all derivatives. We have

$$F(x) = c^{2n} f(x) - c^{2n-1} f'(x) + c^{2n-2} f''(x) - \cdots + f^{(2n)}(x).$$

Again we observe that since $f(x) = f(1 - x)$, it follows that $f^{(j)}(0)$ and
$f^{(j)}(1)$ are integers for all derivatives (check this).
Form $G(x) = e^{cx} F(x)$ and calculate its derivative

$$G'(x) = e^{cx} F'(x) + c e^{cx} F(x) = e^{cx}\bigl(F'(x) + cF(x)\bigr) = e^{cx} c^{2n+1} f(x). \tag{5.5}$$

Now use the mean value theorem on [0, 1] applied to the function G to obtain

$$e^c F(1) - F(0) = e^{c\theta_n} c^{2n+1} f(\theta_n),$$

where $0 < \theta_n < 1$.

Up until now, we haven’t assumed anything about our number $e^c$. We finish
the proof by contradiction. If $e^c$ were rational, we could write $e^c = A/B$
for integers A, B. Then

$$A F(1) - B F(0) = B e^{c\theta_n} c^{2n+1} f(\theta_n).$$

The left-hand side is an integer, and it is positive when c > 0 and B > 0
(which we may assume, since $e^{-c} = 1/e^c$). As for the right-hand side, we
see, by the definition of f(x), that for 0 < x < 1, we have

$$0 < f(x) < \frac{1}{n!}.$$

Hence

$$B e^{c\theta_n} c^{2n+1} f(\theta_n) < \frac{B e^c c^{2n+1}}{n!},$$

which goes to zero as $n \to \infty$. This proves the result.
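
The one step in this argument that deserves to be written out is the claim that $F'(x) + cF(x)$ collapses to a single term. Here is a short worked derivation, writing F with summation indices (the index j is introduced here for bookkeeping):

```latex
\begin{align*}
F(x)   &= \sum_{j=0}^{2n} (-1)^j c^{\,2n-j} f^{(j)}(x), \\
cF(x)  &= c^{2n+1} f(x) + \sum_{j=1}^{2n} (-1)^j c^{\,2n+1-j} f^{(j)}(x), \\
F'(x)  &= \sum_{j=1}^{2n+1} (-1)^{j-1} c^{\,2n+1-j} f^{(j)}(x), \\
F'(x) + cF(x) &= c^{2n+1} f(x) + f^{(2n+1)}(x) = c^{2n+1} f(x).
\end{align*}
```

The terms with $1 \le j \le 2n$ cancel in pairs, and $f^{(2n+1)} = 0$ because f has degree 2n.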
For example, “Using the fact that $\sin\pi = 0$, $\cos\pi = -1$, we conclude
that the left-hand side of (5.4) is integral.”

Theorem 5.6 (Integer powers of e are irrational). If $c \in \mathbb{Z}$ and $c \ne 0$, then
$e^c \notin \mathbb{Q}$.


It follows that $e^\alpha$ is irrational for every rational nonzero $\alpha$. And as a
bonus, we can use Theorem 5.6 to show that the natural logarithm takes
irrational values at integer arguments:


Corollary 5.7. If $m \in \mathbb{Z}$ and $m > 1$, then

$$\ln m \notin \mathbb{Q}.$$


The proof is up to you (Exercise 5.4). Of course, if we ask about the
irrationality of $\log_{10} 2$, the answer is forthcoming if we suppose that
$\log_{10} 2 = p/q$, which leads to an equality asserting that 2 to some integer
power is equal to 5 to some integer power, which is impossible by the
fundamental theorem of arithmetic. Fill in the details in Exercise 5.3.


Exercises 


5.2 The proof of Theorem 5.5 makes quite a few claims. Write out a detailed 
proof of each one. 


5.3 Prove that $\log_{10} 2$ is irrational.
5.4 Prove Corollary 5.7.


5.5 Show that for every real number r,

$$\lim_{n \to \infty} \frac{r^n}{n!} = 0.$$


5.5 The Transcendence of e 


The fact that $e^n$ is not rational is equivalent to the fact that e does not
satisfy a polynomial equation of the form $x^n - r = 0$, where r is rational.
Hence we have made some progress toward the transcendental character of e.
It is a pleasant surprise that the same methods used to establish the
irrationality of $\pi$ and $e^n$ generalize to prove that e is transcendental.
Here are the details.

Suppose that e were algebraic. Write a polynomial relation for e with
integer coefficients and nonvanishing constant term:

$$a_0 + a_1 e + a_2 e^2 + \cdots + a_n e^n = 0, \qquad a_0 \ne 0.$$

In place of the little function $\frac{x^n(1-x)^n}{n!}$ used earlier, we consider a
more complicated beast, the Hermite function

$$f(x) = \frac{x^{p-1}(x-1)^p (x-2)^p \cdots (x-n)^p}{(p-1)!}. \tag{5.6}$$
Keep in mind that throughout the proof, n is fixed, while we will choose p
to be an appropriate large prime.

In fact, n is the alleged degree of the alleged polynomial satisfied by
e, a polynomial that allegedly doesn’t exist (we hope).

Again sum all the derivatives,

$$F(x) = \sum_{j=0}^{\infty} f^{(j)}(x) = f(x) + f^{(1)}(x) + f^{(2)}(x) + \cdots,$$

where the $\infty$ sign is phony, since f(x) is a polynomial of degree
$np + p - 1$, so that the sum is actually finite.
Again, we have $F'(x) - F(x) = -f(x)$, and therefore,

$$\bigl(e^{-x}F(x)\bigr)' = e^{-x}F'(x) - e^{-x}F(x) = -e^{-x}f(x).$$

Now consider the intervals [0, 1], [0, 2], [0, 3], ..., [0, n] and apply the mean
value theorem to the function $e^{-x}F(x)$ on each of these intervals. We have

$$e^{-1}F(1) - F(0) = -e^{-\theta_1} f(\theta_1), \qquad 0 < \theta_1 < 1,$$
$$e^{-2}F(2) - F(0) = -2e^{-\theta_2} f(\theta_2), \qquad 0 < \theta_2 < 2,$$
$$\vdots$$
$$e^{-n}F(n) - F(0) = -ne^{-\theta_n} f(\theta_n), \qquad 0 < \theta_n < n,$$

or equivalently,

$$F(1) - eF(0) = -e^{1-\theta_1} f(\theta_1) = \varepsilon_1,$$
$$F(2) - e^2F(0) = -2e^{2-\theta_2} f(\theta_2) = \varepsilon_2,$$
$$\vdots$$
$$F(n) - e^nF(0) = -ne^{n-\theta_n} f(\theta_n) = \varepsilon_n.$$

Now multiply the jth equation by $a_j$, the coefficient of $e^j$ in the relation
that we assumed e to satisfy. Noting that

$$a_1 e + a_2 e^2 + \cdots + a_n e^n = -a_0,$$

we add and obtain

$$a_1F(1) + a_2F(2) + \cdots + a_nF(n) + a_0F(0) = a_1\varepsilon_1 + \cdots + a_n\varepsilon_n. \tag{5.7}$$
Again, the strategy is similar to that used in Section 5.4. We will show the
following:


(i) For large enough p, the left-hand side of equation (5.7) is an integer not
divisible by p.
(ii) As $p \to \infty$, the right-hand side goes to 0.


Here we go...
For the left-hand side, the Hermite function may be written in the form of a
finite sum

$$f(x) = \frac{\pm (n!)^p\, x^{p-1}}{(p-1)!} + \frac{c_0 x^p}{(p-1)!} + \frac{c_1 x^{p+1}}{(p-1)!} + \cdots, \tag{5.8}$$
For $i \ge p$ and all integers k, we have $p \mid f^{(i)}(k)$.
Hold onto this.


And ... the derivatives $f^{(i)}$ for $0 \le i \le p - 1$
vanish at 1, 2, ..., n.
Hold onto this.


It all hinges on our hope that $p \nmid a_0F(0)$.


Choose p so large that $p \nmid (n!)$ and $p \nmid a_0$.
where the $c_i$ are integers. You can check (Exercise 5.6) that when $i \ge p$,
the coefficients of $f^{(i)}$ are all integers divisible by p. Hence for
$i \ge p$ and all integers k, we have $p \mid f^{(i)}(k)$.

Now look at the definition of f in equation (5.6) to see that f has a root of
multiplicity p at 1, 2, ..., n. Hence all the derivatives $f^{(i)}$ for
$i = 0, \ldots, p - 1$ vanish at 1, 2, ..., n.

Remember F?

$$F(x) = \sum_{j=0}^{\infty} f^{(j)}(x) = f(x) + f^{(1)}(x) + \cdots + f^{(p-1)}(x) + f^{(p)}(x) + f^{(p+1)}(x) + \cdots \tag{5.9}$$

Put the two side notes together to see that F(k) is an integer multiple of p for
$k = 1, \ldots, n$.
This looks like trouble, because we want to show that for large enough p,
the left-hand side of equation (5.7) is an integer not divisible by p. But we
have just shown that all the terms but the last in that left-hand side of (5.7)
are divisible by p. Everything now hinges on the nature of the last term:
$a_0F(0)$.
Look again at equation (5.8) to see that f has a root of multiplicity $p - 1$
at x = 0. So

$$f(0) = f^{(1)}(0) = f^{(2)}(0) = \cdots = f^{(p-2)}(0) = 0.$$

And as the side note reminds us, $f^{(i)}(0)$ is divisible by p for $i \ge p$. So
in the sum

$$F(0) = f(0) + f^{(1)}(0) + \cdots + f^{(p-2)}(0) + f^{(p-1)}(0) + f^{(p)}(0) + f^{(p+1)}(0) + \cdots \tag{5.10}$$

we may cancel everything that is divisible by p (including 0), which is every
term except $f^{(p-1)}(0)$.

So now everything depends on $i = p - 1$. That is, what can we say about
$f^{(p-1)}(0)$?
Looking at (5.8), we see that

$$f(x) = \frac{\pm (n!)^p\, x^{p-1}}{(p-1)!} + \text{higher-degree terms},$$

so we keep whittling it down to obtain

$$f^{(p-1)}(0) = \pm (n!)^p.$$

So far, we haven’t restricted the prime p. But choose it now so that
$p \nmid (n!)$ and $p \nmid a_0$. Bingo: We can now conclude with no further ado that

$$a_1F(1) + a_2F(2) + \cdots + a_nF(n) + a_0F(0) \tag{5.11}$$

is an integer not divisible by p. This is half of what we want to show.
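
The divisibility claims about the Hermite function are easy to confirm in a small case. The sketch below is only an illustration with hypothetical sample values n = 2 and p = 5; it works with the integer numerator $x^{p-1}(x-1)^p\cdots(x-n)^p$ and divides by $(p-1)!$ at the end. The helper names are not from the text.

```python
from math import factorial


def poly_mul(p, q):
    """Product of two integer polynomials (coefficient lists, lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, u in enumerate(p):
        for j, v in enumerate(q):
            out[i + j] += u * v
    return out


def hermite_numerator(n, p):
    """Integer coefficients of x^(p-1) (x-1)^p (x-2)^p ... (x-n)^p."""
    poly = [0] * (p - 1) + [1]
    for k in range(1, n + 1):
        for _ in range(p):
            poly = poly_mul(poly, [-k, 1])
    return poly


def derivative(poly):
    return [k * c for k, c in enumerate(poly)][1:]


def evaluate(poly, x):
    return sum(c * x ** k for k, c in enumerate(poly))


def F_at(n, p, x):
    """F(x), the sum of all derivatives of f, at an integer x in {0, 1, ..., n}."""
    term, total = hermite_numerator(n, p), 0
    while term:                       # derivative() returns [] once the term is exhausted
        total += evaluate(term, x)
        term = derivative(term)
    quotient, remainder = divmod(total, factorial(p - 1))
    assert remainder == 0             # F is an integer at these points
    return quotient


n, p = 2, 5                           # hypothetical sample values, p prime and p > n
print([F_at(n, p, k) % p for k in range(0, n + 1)])
print(hermite_numerator(n, p)[p - 1], factorial(n) ** p)
```

The residues of F(1), ..., F(n) modulo p are all 0, while F(0) leaves the same nonzero residue as $\pm(n!)^p$, exactly as the argument requires.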
The other half is that the right-hand side of our favorite equation,

$$a_1F(1) + a_2F(2) + \cdots + a_nF(n) + a_0F(0) = a_1\varepsilon_1 + \cdots + a_n\varepsilon_n, \tag{5.12}$$

goes to 0 as $p \to \infty$. To see this, recall that

$$\varepsilon_i = -i\,e^{i-\theta_i} f(\theta_i) = \frac{-i\,e^{i-\theta_i}\,\theta_i^{p-1}(\theta_i - 1)^p(\theta_i - 2)^p \cdots (\theta_i - n)^p}{(p-1)!}.$$

Now, recalling that $0 < \theta_i < i \le n$, we have

$$e^{i-\theta_i} < e^n, \qquad \theta_i^{p-1} < n^{p-1}, \qquad |(\theta_i - 1)^p(\theta_i - 2)^p \cdots (\theta_i - n)^p| < (n!)^p.$$

Sooo...

$$|\varepsilon_i| = \left|\frac{i\,e^{i-\theta_i}\,\theta_i^{p-1}(\theta_i - 1)^p(\theta_i - 2)^p \cdots (\theta_i - n)^p}{(p-1)!}\right| < \frac{e^n n^p (n!)^p}{(p-1)!}.$$

Recalling now that n is fixed, we see that the right-hand side of the inequality
above goes to 0 as $p \to \infty$ (Exercise 5.5). So we can choose p so large that

$$|a_1\varepsilon_1 + \cdots + a_n\varepsilon_n| < 1.$$

But the left-hand side of (5.12) is an integer. So the right-hand side must be 0.
But p doesn’t divide the left-hand side, so the left-hand side cannot be 0.
It’s all impossible.

And that does it. If e were algebraic, there would have to be an integer of
absolute value less than 1 that is not divisible by p, and that’s crazy. So we
have (applause) the following theorem.


Theorem 5.8 (e is not algebraic). e is transcendental. 


Except for staring intently at a more complicated polynomial, the idea
is the same as the proof of the irrationality of $\pi$. We were concerned with
$e^{-x} f(x)$ instead of $f(x)\sin x$. A very instructive exercise is to carry out
the proof explicitly for n = 1 and 2. This gives a proof that e is irrational,
which is a very simple fact, and a proof that e is neither a rational number
nor a quadratic irrationality, a less simple fact. Of course, living with the
general proof is most satisfying of all.


Exercises 


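5.6 Show that if $h(x) \in \mathbb{Z}[x]$ and p is a prime, then for $i \ge p$,

$$\frac{d^i}{dx^i}\left(\frac{h(x)}{(p-1)!}\right)$$

is a polynomial in Z[x] whose coefficients are all congruent to 0 mod p.


5.7 If $\alpha \in \mathbb{Q}$ and $\alpha \ne 0$, show that $e^\alpha$ is transcendental.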


5.6 $\pi$ Is Transcendental


We showed earlier that $\pi$ is irrational. In this section we will establish, by
a nontrivial extension of the basic idea underlying the irrationality proof,
that $\pi$ is not an algebraic number. The proof we give is due to Ivan Niven and
was published in 1939 in the American Mathematical Monthly.
See Lookout Point 5.2 for the basic idea.


5.6.1 More About Symmetric Functions


In our proof we need a corollary to the symmetric function theorem
(Theorem 4.10 in Section 4.4):


Lemma 5.9. Consider the following polynomial in Z[x]:

$$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n = (x - \theta_1)\cdots(x - \theta_n).$$

If $g(x_1, \ldots, x_n)$ is a symmetric polynomial in variables $x_1, \ldots, x_n$, with
coefficients in Z, then $g(\theta_1, \theta_2, \ldots, \theta_n)$ is an integer.

Proof. By the symmetric function theorem,
$g(x_1, \ldots, x_n) = h(\sigma_1, \ldots, \sigma_n) \in \mathbb{Z}[\sigma_1, \ldots, \sigma_n]$, where
$\sigma_1, \ldots, \sigma_n$ are the elementary symmetric functions in $x_1, \ldots, x_n$.
Now substitute $(\theta_1, \ldots, \theta_n)$ for $(x_1, \ldots, x_n)$. The $\sigma_i$ evaluated at
$\theta_1, \ldots, \theta_n$ are precisely the integers $a_1, \ldots, a_n$ up to a ± sign.
Hence the result. □


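In the application we have in mind, the values of θ that arise satisfy
polynomials in $\mathbb{Z}[x]$ that are not monic. But by a change of variable, we can
transform any polynomial in $\mathbb{Z}[x]$ into a monic one. If

\[ a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0 = a_n(x - \theta_1)\cdots(x - \theta_n), \]

then

\[ a_n^{n-1}\left(a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0\right) = a_n^{n}(x - \theta_1)\cdots(x - \theta_n), \]

so

\[ (a_n x)^n + a_{n-1}(a_n x)^{n-1} + a_n a_{n-2}(a_n x)^{n-2} + \cdots + a_n^{n-1}a_0 \]
\[ = (a_n x)^n + a'_{n-1}(a_n x)^{n-1} + \cdots + a'_0 \]
\[ = (a_n x - a_n\theta_1)(a_n x - a_n\theta_2)\cdots(a_n x - a_n\theta_n), \]

where $a'_{n-1},\dots,a'_0$ are integers. In other words, $a_n\theta_1,\dots,a_n\theta_n$ are the
roots of a monic polynomial in $\mathbb{Z}[x]$, so Lemma 5.9 applies to them. It follows that if
$g(x_1,\dots,x_n)$ is a symmetric polynomial with coefficients in $\mathbb{Z}$ and leading
coefficient $a_n$, then $g(a_n\theta_1,\dots,a_n\theta_n)$ is in $\mathbb{Z}$. In particular, if
$g(x_1,\dots,x_n)$ is homogeneous of degree $s$, which means that

\[ g(tx_1,\dots,tx_n) = t^s g(x_1,\dots,x_n), \]

then $a_n^s\, g(\theta_1,\dots,\theta_n)$ is in $\mathbb{Z}$. We will make use of this observation
later in this section, so let us state it as a lemma.

Lemma 5.10. Suppose

\[ a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0 = a_n(x - \theta_1)\cdots(x - \theta_n), \]

where the coefficients are in $\mathbb{Z}$. If $g(x_1,\dots,x_n)$ is a symmetric polynomial
with coefficients in $\mathbb{Z}$ and leading coefficient $a_n$, then
$g(a_n\theta_1,\dots,a_n\theta_n)$ is in $\mathbb{Z}$. In particular, if $g(x_1,\dots,x_n)$ is
homogeneous of degree $s$, then we also have that $a_n^s\, g(\theta_1,\dots,\theta_n)$ is in $\mathbb{Z}$.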
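Lemma 5.10 is easy to test on a small case. The sketch below is only an illustration under stated assumptions: the polynomial $2x^2 - 3x - 1$ is a hypothetical example, $g$ is taken to be the power sum $x_1^k + \cdots + x_n^k$ (homogeneous of degree $k$), and the power sums are computed with Newton's identities, a technique swapped in here so that no root is ever approximated numerically.

```python
from fractions import Fraction

def power_sums(coeffs, kmax):
    """Power sums p_1..p_kmax of the roots of a_n x^n + ... + a_0.

    coeffs = [a_n, ..., a_0] (integers, a_n != 0). Uses Newton's identities
    on the monic polynomial obtained by dividing through by a_n.
    """
    n = len(coeffs) - 1
    c = [Fraction(a, coeffs[0]) for a in coeffs]   # c[0] = 1, c[i] = a_(n-i)/a_n
    p = []                                          # p[k-1] holds p_k
    for k in range(1, kmax + 1):
        total = sum(c[i] * p[k - i - 1] for i in range(1, min(k - 1, n) + 1))
        if k <= n:
            total += k * c[k]
        p.append(-total)
    return p

coeffs = [2, -3, -1]              # hypothetical example: 2x^2 - 3x - 1
a_n = coeffs[0]
sums = power_sums(coeffs, 6)
scaled = [a_n ** k * p for k, p in enumerate(sums, start=1)]
print(sums[:3])                   # p_1, p_2, p_3 are genuine fractions
print(scaled)                     # a_n^k * p_k should all be integers
```

The power sums themselves have denominators, but multiplying the degree-$k$ sum by $a_n^k$ clears them, which is the content of the homogeneous case of the lemma.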



5.6.2 Euler’s Identity 


We now prove that π is transcendental. The basic strategy is similar to that
used for e: assuming that π is algebraic, we construct a nonzero integer that
is of absolute value less than 1.

The function that plays the central role is somewhat more complicated
than the one used for e. Here is how we define it. If π were algebraic, then iπ
would also be algebraic. Recall that $e^x$ is a complex-valued function of the
complex variable x. We have

\[ e^x = 1 + x + \frac{x^2}{2!} + \cdots, \]

and this holds for complex x. Substituting $x = it$, where t is real, we have

\[ e^{it} = 1 + it + \frac{i^2t^2}{2!} + \frac{i^3t^3}{3!} + \cdots, \]

and on rearranging the (absolutely convergent!) series, we obtain

\[ e^{it} = 1 - \frac{t^2}{2!} + \frac{t^4}{4!} - \cdots + i\left[t - \frac{t^3}{3!} + \frac{t^5}{5!} - \cdots\right] = \cos t + i\sin t. \]

On substituting $t = \pi$, we obtain

\[ e^{i\pi} + 1 = 0. \]
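The convergence of the exponential series at $x = i\pi$ can be watched numerically. This is a small sketch, not part of the proof; the helper name `exp_partial_sum` is made up for the illustration.

```python
from math import pi

def exp_partial_sum(x, terms):
    """Sum of the first `terms` terms of 1 + x + x^2/2! + x^3/3! + ..."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= x / (k + 1)       # next term: x^(k+1) / (k+1)!
    return total

for terms in (5, 10, 20, 40):
    s = exp_partial_sum(1j * pi, terms)
    print(terms, s, abs(s + 1))   # the distance to -1 shrinks rapidly
```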


5.6.3 Setting the Stage 


This is our beginning point. Suppose that $i\pi = \alpha_1$ satisfies an irreducible
polynomial over $\mathbb{Q}$ with roots $\alpha_1, \alpha_2, \dots, \alpha_n$. Then since
$e^{\alpha_1} + 1 = 0$, we have

\[ (e^{\alpha_1} + 1)(e^{\alpha_2} + 1)\cdots(e^{\alpha_n} + 1) = 0. \tag{5.13} \]

Multiplying out yields

\[ 1 + e^{\alpha_1} + e^{\alpha_2} + \cdots + e^{\alpha_n} + e^{\alpha_1+\alpha_2} + e^{\alpha_1+\alpha_3} + \cdots + e^{\alpha_1+\alpha_2+\cdots+\alpha_n} = 0. \]

The exponents are the various sums of nonempty subsets of $\{\alpha_1,\dots,\alpha_n\}$. There are
$2^n - 1$ such numbers (why?). Call them $\rho_1,\dots,\rho_s$, where $s = 2^n - 1$, and form
the polynomial that has them as roots: $\prod_{i=1}^{s}(x - \rho_i) = g(x)$. Then $g(x) \in
\mathbb{Q}[x]$. This follows from the symmetric function theorem and the observation
that $g(x) = \prod_{j=1}^{n} h_j(x)$, where $h_j(x)$ is the polynomial whose roots are the
various sums of $j$ of the α's. Since the set of sums taken $j$ at a time is invariant
under the various permutations of $\alpha_1,\dots,\alpha_n$, it follows that $h_j(x) \in \mathbb{Q}[x]$,
and from this it follows that $g(x) \in \mathbb{Q}[x]$.
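To see the construction of $g(x)$ in a case small enough to check by hand, the sketch below uses a hypothetical stand-in: the conjugate pair $\alpha_1 = i$, $\alpha_2 = -i$ (the roots of $x^2 + 1$), not $i\pi$ itself. The function names are invented for the illustration.

```python
from itertools import combinations

def subset_sums(alphas):
    """Sums over all nonempty subsets of the list `alphas`."""
    return [sum(c) for j in range(1, len(alphas) + 1)
            for c in combinations(alphas, j)]

def poly_from_roots(roots):
    """Coefficients (highest degree first) of the product of (x - rho)."""
    coeffs = [1]
    for rho in roots:
        # multiply the current polynomial by (x - rho)
        coeffs = [a - rho * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

alphas = [1j, -1j]                 # hypothetical conjugates: roots of x^2 + 1
rhos = subset_sums(alphas)         # 2^2 - 1 = 3 subset sums
print(rhos)
print(poly_from_roots(rhos))       # coefficients of g(x)
```

Here the subset sums are $i$, $-i$, and $0$, so $g(x) = x^3 + x$ has rational coefficients, and the sum that vanishes shows up as a factor of $x$.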

See Lookout Point 5.2.

See formula 2.1 in Section 2.1.

This is a very famous and celebrated formula. Google “Euler’s Identity.”

A worthwhile timeout: count the number of times you have seen a calculation like this.

“carries over without difficulty...” See Lookout Point 5.3.

On clearing the denominators, we may now assume that g(x) has integer
coefficients. There is one more comment: It may happen that various sums of
the α's are 0, and that will contribute a factor of x to g(x). Cancel them out
so that we may write

\[ a(x) = c x^r + c_{r-1} x^{r-1} + \cdots + c_0, \tag{5.14} \]

where $c, c_{r-1}, \dots, c_0$ are in $\mathbb{Z}$ and $c_0 \neq 0$. Furthermore, now the roots of a(x)
are the nonzero roots of g(x). Call these roots $\beta_1,\dots,\beta_r$, so that

\[ a(x) = c(x - \beta_1)(x - \beta_2)\cdots(x - \beta_r). \tag{5.15} \]

Notice also that from equation (5.13), we have

\[ e^{\beta_1} + e^{\beta_2} + \cdots + e^{\beta_r} + k = 0, \tag{5.16} \]

where $k \ge 1$ is an integer arising from the leading 1 and the various zero exponents.


5.6.4 And Now... the Proof 


With these preliminaries out of the way, we can construct the analogue of the
Hermite beast. We define f(x) by

\[ f(x) = \frac{c^m x^{p-1}(a(x))^p}{(p-1)!} = \frac{c^{m+p} x^{p-1}\left(\prod_{i=1}^{r}(x - \beta_i)\right)^p}{(p-1)!}, \tag{5.17} \]

where m is any fixed integer greater than rp, the degree of $a(x)^p$. Now we
proceed as earlier. Form

\[ F(x) = f(x) + f'(x) + f''(x) + \cdots, \]
\[ F'(x) = f'(x) + f''(x) + f'''(x) + \cdots, \]
\[ F(x) - F'(x) = f(x). \]

Hence, we have (as usual)

\[ \left(e^{-x}F(x)\right)' = e^{-x}F'(x) - e^{-x}F(x) = e^{-x}\left(F'(x) - F(x)\right) = -e^{-x}f(x). \]


The various $\beta_i$ are complex numbers. Draw line segments to the various $\beta_i$
in the complex plane as in Figure 5.1. These segments play the role of closed
intervals in calculus, so that we can write $\int_0^{\beta_i} -e^{-x} f(x)\,dx$ and so on.

The function $e^{-x} f(x)$ is a complex-valued continuous function of the
complex variable x. Thus we may integrate $e^{-x} f(x)$ along segments, and the
fundamental theorem of calculus carries over without difficulty. The derivative
of $e^{-x} F(x)$ is $-e^{-x} f(x)$ (in the complex sense), and so we obtain, for
the various $\beta_i$,

\[ \int_0^{\beta_i} -e^{-x} f(x)\,dx = e^{-\beta_i}F(\beta_i) - F(0). \]

And then

\[ e^{\beta_i}\int_0^{\beta_i} -e^{-x} f(x)\,dx = F(\beta_i) - e^{\beta_i}F(0). \]
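The claim that the fundamental theorem of calculus carries over to segments in the complex plane can be tested numerically. The following is a rough sketch under assumptions: $f(x) = 3x^2 + x^3$ and the endpoint $\beta = 1 + 2i$ are hypothetical choices, and the integral is approximated by a midpoint rule along the parametrization $x = \beta t$, $0 \le t \le 1$.

```python
import cmath

def deriv(poly):
    """Derivative of a polynomial given by coefficients, lowest degree first."""
    return [k * a for k, a in enumerate(poly)][1:]

def evaluate(poly, x):
    return sum(a * x ** k for k, a in enumerate(poly))

def big_F(poly):
    """F = f + f' + f'' + ... (a finite sum, since f is a polynomial)."""
    total, current = [0] * len(poly), list(poly)
    while current:
        for k, a in enumerate(current):
            total[k] += a
        current = deriv(current)
    return total

def segment_integral(h, beta, steps=20000):
    """Midpoint-rule value of the integral of h from 0 to beta along the
    straight segment x = beta * t, 0 <= t <= 1 (so dx = beta dt)."""
    dt = 1.0 / steps
    return sum(h(beta * (j + 0.5) * dt) for j in range(steps)) * beta * dt

f = [0, 0, 3, 1]                  # hypothetical f(x) = 3x^2 + x^3
F = big_F(f)                      # 12 + 12x + 6x^2 + x^3
beta = 1 + 2j                     # hypothetical endpoint
lhs = segment_integral(lambda x: -cmath.exp(-x) * evaluate(f, x), beta)
rhs = cmath.exp(-beta) * evaluate(F, beta) - evaluate(F, 0)
print(lhs, rhs)                   # the two values should agree closely
```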
Figure 5.1. The segments to the various $\beta_i$.


Summing over $i = 1,\dots,r$ and using equation (5.16), we obtain the basic
relation

\[ F(\beta_1) + \cdots + F(\beta_r) + kF(0) = \sum_{i=1}^{r}\left(e^{\beta_i}\int_0^{\beta_i} -e^{-x} f(x)\,dx\right). \tag{5.18} \]

The remainder of the proof consists in showing (as is our custom) that for a
suitable prime p, the left-hand side is a nonzero integer, while the right-hand
side is less than 1 in absolute value.

Let us handle the left-hand side first, and begin with kF(0). Recall the
definition of f from equation (5.17):

\[ f(x) = \frac{c^m x^{p-1}(a(x))^p}{(p-1)!} = \frac{c^m x^{p-1}\left(c x^r + c_{r-1}x^{r-1} + \cdots + c_0\right)^p}{(p-1)!}. \]

We see that at 0, all derivatives up to the (p − 2)nd are 0. The (p − 1)st
derivative at 0 is $c^m c_0^p$, and all subsequent derivatives are integers divisible
by p. Thus if we choose the prime p so large that $p \nmid c c_0$, then it won’t divide
$c^m c_0^p$, and thus $f(0) + f'(0) + f''(0) + \cdots$ must be an integer not divisible
by p. Hence for p sufficiently large (take $p > k$ as well, so that $p \nmid k$), kF(0) is
not divisible by p.
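The divisibility pattern of the derivatives of f at 0 can be confirmed with exact integer arithmetic. This sketch assumes a hypothetical $a(x) = 2x + 3$ (so $c = 2$, $c_0 = 3$, $r = 1$), the prime $p = 5$, and $m = 6 > rp$; none of these values come from the text.

```python
from math import factorial

def poly_mul(a, b):
    """Product of two integer polynomials (coefficients lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def derivatives_at_zero(a_poly, p, m):
    """Values f^(t)(0), t = 0, 1, ..., for f = c^m x^(p-1) a(x)^p / (p-1)!."""
    c = a_poly[-1]                        # leading coefficient of a(x)
    num = [c ** m]
    for _ in range(p):
        num = poly_mul(num, a_poly)       # c^m * a(x)^p
    num = [0] * (p - 1) + num             # multiply by x^(p-1)
    # f^(t)(0) = t! * (coefficient of x^t) / (p-1)!; the division is exact
    # because the coefficient is 0 for t < p-1 and (p-1)! divides t! otherwise.
    return [factorial(t) * coef // factorial(p - 1) for t, coef in enumerate(num)]

p, m = 5, 6                               # hypothetical prime and exponent
vals = derivatives_at_zero([3, 2], p, m)  # a(x) = 3 + 2x
print(vals[:p - 1])                       # derivatives below order p-1
print(vals[p - 1], 2 ** m * 3 ** p)       # the (p-1)st derivative vs c^m c_0^p
print([v % p for v in vals[p:]])          # higher derivatives mod p
print(sum(vals) % p)                      # F(0) mod p
```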

Now we examine the integrality of $F(\beta_1) + \cdots + F(\beta_r)$. First, each $\beta_i$ is a
root of a(x) (see equation (5.15)), so each $\beta_i$ is a root of f(x) of multiplicity p.
This implies that

\[ f^{(j)}(\beta_i) = 0 \quad\text{for } j = 0, 1, \dots, p-1, \]

and this implies that for $j = 0, 1, \dots, p-1$, we have

\[ f^{(j)}(\beta_1) + f^{(j)}(\beta_2) + \cdots + f^{(j)}(\beta_r) = 0. \tag{5.19} \]

Now consider the above sum for $t = p, p+1, \dots$:

\[ f^{(t)}(\beta_1) + \cdots + f^{(t)}(\beta_r). \]

Recall that $f(x) = \dfrac{c^m x^{p-1}(a(x))^p}{(p-1)!}$.

First of all, $f^{(t)}(x)$ in this range of t has integer coefficients divisible by
p, and furthermore, each coefficient is divisible by $c^m$. The degree of f(x)
is $rp + p - 1$. Therefore, after $p - 1$ differentiations, it has degree $rp = s$.
Since $m > rp$, we conclude that $p c^s$ divides each coefficient of $f^{(t)}(x)$ for
$t = p, p+1, \dots$.

Now substitute $\beta_i$ into $f^{(t)}(x)$, giving $f^{(t)}(\beta_i)$. This quantity is a
polynomial in $\beta_i$ each of whose coefficients is divisible by $p c^s$.

Now, $f^{(t)}(\beta_1) + \cdots + f^{(t)}(\beta_r)$ is symmetric in $\beta_1,\dots,\beta_r$, and furthermore,
it is a sum of homogeneous symmetric polynomials each of degree at most s.
Hence by Lemma 5.10, we see that $f^{(t)}(\beta_1) + \cdots + f^{(t)}(\beta_r)$ is an integer
divisible by p. It follows that $F(\beta_1) + \cdots + F(\beta_r)$ is an integer divisible by p.
Finally, we conclude that the left-hand side of equation (5.18) is an integer
not divisible by the prime p. Hence it is not zero. Whew!

In order to estimate the right-hand side of equation (5.18), we use a standard
estimate for $\int_L h(x)\,dx$, where L is a path in the complex plane and h(x)
is a continuous complex-valued function on L. If |L| denotes the length of L,
then the estimate is

\[ \left|\int_L h(x)\,dx\right| \le |L|\,M, \tag{5.20} \]

where M is the maximum (least upper bound) of |h(x)| on L. That the right-hand
side is less than 1 in absolute value for some p follows just as in all the
other proofs. It boils down to Exercise 5.5:

\[ \lim_{p\to\infty}\frac{R^{p-1}}{(p-1)!} = 0. \]

The proof is finished by observing that p had to be chosen big enough to
satisfy a (small) and finite number of conditions. Hooray!


Lookout Point 5.3. Don’t consider the introduction of the complex line
integral in this proof as a serious violation of the request for simplicity and
elementary technique. The integral is defined simply in terms of ordinary
Riemann integrals. More precisely, if $(\varphi(t), \psi(t))$ parametrizes an arc Γ as t
goes from 0 to 1 and if $f(z) = u(x, y) + iv(x, y)$, then one defines

\[ \int_\Gamma f(z)\,dz = \int_0^1 \left[u(\varphi,\psi)\varphi'(t) - v(\varphi,\psi)\psi'(t)\right]dt + i\int_0^1 \left[v(\varphi,\psi)\varphi'(t) + u(\varphi,\psi)\psi'(t)\right]dt. \]

This follows formally from

\[ f(z)\,dz = (u + iv)(dx + i\,dy) = u\,dx - v\,dy + i(v\,dx + u\,dy). \]

Suppose that $F(z) = U(x, y) + iV(x, y)$ satisfies the Cauchy–Riemann equations

\[ \frac{\partial U}{\partial x} = \frac{\partial V}{\partial y}, \qquad \frac{\partial U}{\partial y} = -\frac{\partial V}{\partial x}. \tag{5.21} \]

Write $F'(z) = U_x + iV_x$. It is shown in every text on complex analysis that this
definition is consistent with the definition in terms of differential quotients.
For our functions $e^{-x}F(x)$, it is a simple exercise. Then, if $F'(z) = f(z)$, a
simple application of the definition and the ordinary fundamental theorem of
calculus shows that

\[ \int_\Gamma f(z)\,dz = F(\beta) - F(\alpha), \]

where Γ begins at α and ends at β. Finally, by referring everything to Riemann
sums, it is possible to derive the basic estimate $\left|\int_L f(z)\,dz\right| \le |L|\max|f(z)|$.
For a proof of the transcendence of π that does not use calculus, see Hardy
and Wright [35]. But the use of complex integration theory is essential to the
modern theory of transcendental numbers.
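One more thing: It is a good idea to have a simple infinite series for π, just
as we had one for e. Such a series is given by the Leibniz-Gregory series

\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots. \]

A short proof of this identity has been noted by Donat Kazarinoff [44]. Consider

\[ a_n = \int_0^{\pi/4} \tan^n x\,dx, \qquad n \ge 0. \]

Since tan x is between 0 and 1 on [0, π/4], we see that $a_n$ is a monotonically
decreasing sequence bounded below by 0. Suppose $a_n \to a$. Now, for $n \ge 2$,

\[ a_n = \int_0^{\pi/4}\tan^{n-2}x\,\tan^2 x\,dx = \int_0^{\pi/4}\tan^{n-2}x\,(\sec^2 x - 1)\,dx \]
\[ = \int_0^{\pi/4}\tan^{n-2}x\,\sec^2 x\,dx - \int_0^{\pi/4}\tan^{n-2}x\,dx \]
\[ = \left.\frac{\tan^{n-1}x}{n-1}\right|_0^{\pi/4} - a_{n-2} = \frac{1}{n-1} - a_{n-2}. \]

Thus $a_n + a_{n-2} = \dfrac{1}{n-1}$.

Letting $n \to \infty$, we see that $a + a = 0$, and therefore $a = 0$. Now replace n
by 2n and use the recurrence relation

\[ a_{2n} = \frac{1}{2n-1} - a_{2n-2}. \]

We have, since $a_0 = \pi/4$,

\[ a_{2n} = \frac{1}{2n-1} - \frac{1}{2n-3} + \cdots + (-1)^{n-1} + (-1)^n\,\frac{\pi}{4}. \]

Since $a_{2n} \to 0$, we have proved the desired formula.
If we begin instead with $a_{2n+1}$, we get another nice formula:

\[ a_{2n+1} = \frac{1}{2n} - \frac{1}{2n-2} + \cdots + (-1)^{n-1}\,\frac{1}{2} + (-1)^n a_1. \]

But

\[ a_1 = \int_0^{\pi/4}\tan x\,dx = \log\sec x\,\Big|_0^{\pi/4} = \ln\sqrt{2} = \frac{1}{2}\ln 2. \]

In case you didn’t notice, this is Dialing In 131.

Thus, since $a_{2n+1} \to 0$,

\[ \frac{\ln 2}{2} = \frac{1}{2} - \frac{1}{4} + \frac{1}{6} - \cdots, \]

and so we have

\[ \ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]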
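Both series converge slowly, which is easy to see by computing partial sums. A minimal sketch (the function names are made up for the illustration):

```python
from math import pi, log

def leibniz(terms):
    """Partial sum 1 - 1/3 + 1/5 - ... with `terms` terms."""
    return sum((-1) ** k / (2 * k + 1) for k in range(terms))

def alternating_harmonic(terms):
    """Partial sum 1 - 1/2 + 1/3 - ... with `terms` terms."""
    return sum((-1) ** (k + 1) / k for k in range(1, terms + 1))

for terms in (10, 1000, 100000):
    # errors against pi/4 and ln 2; both shrink roughly like 1/terms
    print(terms, leibniz(terms) - pi / 4, alternating_harmonic(terms) - log(2))
```

The error after n terms of the first series is exactly $a_{2n}$ in absolute value, which explains the slow decay: the recurrence gives $a_{2n} + a_{2n-2} = 1/(2n-1)$, so $a_{2n}$ is of order $1/n$.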


Exercises 


5.8 Show that if $g(z) \in \mathbb{C}[z]$, then $e^{-z}g(z)$ satisfies the Cauchy–Riemann
equations (5.21).


5.9 Suppose that f and g are polynomials in $\mathbb{C}[x]$ and n is a nonnegative
integer. Show that

\[ (fg)^{(n)}(x) = \sum_{k=0}^{n}\binom{n}{k} f^{(k)}(x)\,g^{(n-k)}(x). \]


5.10 Take It Further. The proofs in this chapter involved concocting certain 
functions that were at the core of the results: 


(i) f(x) = a is used in the proof that π is irrational.
(ii) f(x) = “Cay is used in the proof that e° is irrational. 
(iii)

\[ f(x) = \frac{x^{p-1}\bigl((x-1)(x-2)\cdots(x-n)\bigr)^{p}}{(p-1)!} \]

is used in the proof that e is transcendental.

(iv) In the proof that π is transcendental, we used this:

\[ f(x) = \frac{c^m x^{p-1}(a(x))^p}{(p-1)!} = \frac{c^{m+p}x^{p-1}\prod_{i=1}^{r}(x-\beta_i)^p}{(p-1)!}. \]

What are some structural similarities among these beasts? How are their
definitions related to the results in which they are used?


5.11 Show that a rectangle is determined by its perimeter and area. 

5.12 Show that a rectangular box is determined by its edge perimeter, surface
area, and volume.

5.13 Find two different rectangular boxes with the same edge perimeter and 
volume. Can you find two such that have rational side lengths? 



Fourier Series and Gauss Sums 


The basic functions that we will be concerned with in this chapter are

\[ x \mapsto \sin nx \quad\text{and}\quad x \mapsto \cos nx \]

for n = 0,1,2,3,.... The elements of the vector spaces over R spanned by
these functions are called finite trigonometric series. Thus a finite trigonometric
series of degree at most n is a function of the form

\[ f(x) = c_0 + c_1\sin x + d_1\cos x + c_2\sin 2x + d_2\cos 2x + \cdots + c_n\sin nx + d_n\cos nx, \tag{6.1} \]

where $c_0, c_1, \dots, c_n, d_1, \dots, d_n$ are in R. For example, $1 + \cos 2x$ is a trigonometric
series that defines the same function as $2\cos^2 x$. Similarly, the function
$4\cos^3 x$ is the trigonometric series $3\cos x + \cos 3x$.

And there’s more to come.

If one is given a function f that can be expressed as a finite trigonometric 
series, is it possible to express the coefficients co, c1,d1,...,Cn, dn in terms 
of f? 

This is where the chapter will begin, showing that there are indeed explicit 
expressions for the coefficients in a finite trigonometric series in terms of the 
function so expressed. And there’s more: we shall investigate the extent to 
which an arbitrary function (with some restrictions) can be expressed as a 
(possibly infinite) trigonometric series called its Fourier series. Fourier series 
will lead us into a wonderful garden of formulas that involve trigonometric 
functions and to a generalization of the “Gauss sum” that we met way back 
in Section 2.2. 


6.1 The Fourier Series of a Differentiable
Function and ζ(2)


This chapter opened with a question:


If one is given a finite trigonometric series f(x) of the above form, is it
possible to express the coefficients $c_0, c_1, d_1, \dots, c_n, d_n$ in terms of the
function f?


The answer is supplied by recalling the basic “orthogonality relations”
between the functions sin nx and cos nx. Namely, if n and m are nonnegative
integers, then

(i) $\displaystyle\int_{-\pi}^{\pi}\cos nx\,\cos mx\,dx = \begin{cases}\pi & \text{if } n = m \neq 0,\\ 0 & \text{if } n \neq m;\end{cases}$

(ii) $\displaystyle\int_{-\pi}^{\pi}\sin nx\,\cos mx\,dx = 0$ for all n, m;

(iii) $\displaystyle\int_{-\pi}^{\pi}\sin nx\,\sin mx\,dx = \begin{cases}\pi & \text{if } n = m \neq 0,\\ 0 & \text{if } n \neq m.\end{cases}$

In order to verify these formulas, recall the formulas 
2 cosnx cosmx = cos(n + m)x + cos(n —m)x, 
2 sinnx sinmx = cos(n — m)x — cos(n+m)x, 
2 sinnx cos mx = sin(n + m)x + sin(n-m)x. 


To check equality (i), for example, we use the first identity: If $n = m \neq 0$, then
$2\cos^2 nx = \cos 2nx + 1$, so that

\[ 2\int_{-\pi}^{\pi}\cos^2 nx\,dx = \int_{-\pi}^{\pi}\cos 2nx\,dx + \int_{-\pi}^{\pi} 1\,dx = \left.\frac{\sin 2nx}{2n}\right|_{-\pi}^{\pi} + 2\pi = 0 + 2\pi, \]

or equivalently,

\[ \int_{-\pi}^{\pi}\cos^2 nx\,dx = \pi. \]

If $n \neq m$, then

\[ 2\int_{-\pi}^{\pi}\cos nx\,\cos mx\,dx = \left.\frac{\sin(n+m)x}{n+m}\right|_{-\pi}^{\pi} + \left.\frac{\sin(n-m)x}{n-m}\right|_{-\pi}^{\pi} = 0. \]

The remaining orthogonality relations are left as an (important!) exercise; see Exercise 6.3.
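Before proving the remaining relations, it can be reassuring to check all three numerically. The sketch below uses a plain midpoint rule (an assumption of this illustration, not a method from the text); for trigonometric polynomials over a full period it is accurate essentially to rounding error.

```python
from math import sin, cos, pi

def integrate(h, a=-pi, b=pi, steps=2000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / steps
    return sum(h(a + (j + 0.5) * dx) for j in range(steps)) * dx

def relations(n, m):
    """The three integrals (cos*cos, sin*cos, sin*sin) over [-pi, pi]."""
    cc = integrate(lambda x: cos(n * x) * cos(m * x))
    sc = integrate(lambda x: sin(n * x) * cos(m * x))
    ss = integrate(lambda x: sin(n * x) * sin(m * x))
    return cc, sc, ss

for n in range(1, 4):
    for m in range(1, 4):
        # expect roughly (pi, 0, pi) when n == m and (0, 0, 0) otherwise
        print(n, m, [round(v, 8) for v in relations(n, m)])
```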


In order to answer our question about expressing the c’s and d’s in terms
of f(x), we proceed as follows. To get $c_0$, we simply integrate both sides of
equation (6.1) from −π to π. That gives

\[ \int_{-\pi}^{\pi} f(x)\,dx = 2\pi c_0, \]

or equivalently,

\[ c_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx. \]

For $c_n$ when $n > 0$, multiply by sin nx and integrate from −π to π, obtaining

\[ \int_{-\pi}^{\pi} f(x)\sin nx\,dx = c_n\int_{-\pi}^{\pi}\sin^2 nx\,dx = c_n\pi, \]

or equivalently,

\[ c_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx. \tag{6.2} \]

Similarly,

\[ d_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx. \tag{6.3} \]
In this way, we have expressed the coefficients $c_n$ and $d_n$ in terms of the
integrals of f multiplied by sin nx and cos nx, the range of integration being
the interval from −π to π. As a simple example, taking n = 3 and using the
identity $4\cos^3 x = 3\cos x + \cos 3x$, a computation shows that

\[ \int_{-\pi}^{\pi}\cos^3 x\,\cos 3x\,dx = \frac{\pi}{4}. \]

If f(x) is any function integrable on [−π, π] (say, continuous or piecewise
continuous), then the above observations motivate the definition of a sequence of
constants called the Fourier coefficients of f. They are defined by the following
formulas:

\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \quad n = 0, 1, \dots, \]
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \dots. \tag{6.4} \]

Notice that the $a_n$ begin with

\[ a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx, \]

while the $b_n$ begin with $b_1, b_2, \dots$. Formulas (6.2) and (6.3) can now be stated
like this:

Lemma 6.1. If f(x) is a finite trigonometric series of degree n,

\[ f(x) = \frac{a_0}{2} + \sum_{k=1}^{n}\left(a_k\cos kx + b_k\sin kx\right), \]

then $a_0, a_1, \dots, a_n, b_1, \dots, b_n$ are the Fourier coefficients of f.
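Lemma 6.1 can be tried out on the example $4\cos^3 x = 3\cos x + \cos 3x$ from the start of the chapter. This is an illustrative sketch: the integrals in (6.4) are approximated with a midpoint rule, and `fourier_coefficients` is a name invented here.

```python
from math import sin, cos, pi

def integrate(h, a=-pi, b=pi, steps=2000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / steps
    return sum(h(a + (j + 0.5) * dx) for j in range(steps)) * dx

def fourier_coefficients(f, n_max):
    """Approximate a_0..a_n_max and b_1..b_n_max from formulas (6.4)."""
    a = [integrate(lambda x: f(x) * cos(n * x)) / pi for n in range(n_max + 1)]
    b = [integrate(lambda x: f(x) * sin(n * x)) / pi for n in range(1, n_max + 1)]
    return a, b

a, b = fourier_coefficients(lambda x: 4 * cos(x) ** 3, 4)
print([round(v, 8) for v in a])    # a_0, ..., a_4
print([round(v, 8) for v in b])    # b_1, ..., b_4
```

Up to rounding, the only nonzero values should be $a_1 = 3$ and $a_3 = 1$, matching the coefficients of $3\cos x + \cos 3x$.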


If f(x) is not a finite trigonometric series, then the Fourier coefficients
$a_n$ and $b_n$ need not be zero for n large. But it is natural to ask whether the
resulting infinite series

\[ \frac{a_0}{2} + \sum_{k=1}^{\infty}\left(a_k\cos kx + b_k\sin kx\right) \tag{6.5} \]

converges for $x \in (-\pi, \pi)$ and whether, if it does, it equals f(x). When it does
equal f(x), we call it the Fourier series of f.

This is the basic problem in the elementary theory of Fourier series. A
partial solution to the problem was supplied by Johann Peter Gustav Lejeune
Dirichlet (1805-1859) in 1829 when he proved that if f(x) has a derivative
at $x_0$ in (−π, π), then indeed, the series of numbers

\[ \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos nx_0 + b_n\sin nx_0\right) \]

converges to the number $f(x_0)$. We will prove this result in the next section.
But there is a special case of a lemma due to Riemann that implies that the
infinite sum above has a chance of converging to something.

Keep in mind that the $a_n$ belong to the “cosine” coefficients and the $b_n$
belong to the “sine” coefficients.

For a discussion and proof of this situation (and more), see Casper Goffman’s
Real Analysis [85]. And, as you’ll soon see, “elementary” is not the
same as “simple.”
Lemma 6.2 (Riemann). If f(x) is continuous on [a, b] and has a continuous
derivative on (a, b), then the Fourier coefficients of f go to zero as n
approaches infinity. In symbols,

\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx \to 0 \quad\text{as } n \to \infty, \]
\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \to 0 \quad\text{as } n \to \infty. \]

Proof. On integrating by parts, we have

\[ \int_{-\pi}^{\pi} f(x)\sin nx\,dx = -\left.\frac{f(x)\cos nx}{n}\right|_{-\pi}^{\pi} + \frac{1}{n}\int_{-\pi}^{\pi} f'(x)\cos nx\,dx. \]

Each term goes to 0 as $n \to \infty$. The proof of the second statement is similar. □


Lookout Point 6.1. Actually, the above lemma holds for every function
f(x) that has an integral on [a, b]. Recall that this condition imposes restrictions
on f. It turns out that the requirement of (Riemann) integrability is
equivalent to the requirement that the set of discontinuities have “measure”
zero. In particular, a bounded function with a countable set of discontinuities has a
Riemann integral. (As an example of a function that does not have a Riemann
integral, consider the so-called Dirichlet function, which equals 1 for
x rational and 0 for x irrational. It does not have a Riemann integral on any
finite interval because it is discontinuous everywhere.) Although we will use
only the case in which f(x) is composed of a finite number of continuously
differentiable pieces (piecewise $C^1$), let us show how the general Riemann
lemma follows from an important inequality due to Bessel. The proof of
Bessel’s inequality (up next) is actually more elementary than the proof of
Lemma 6.2, because it doesn’t use integration by parts. However, the basic
idea is sophisticated, since it asks for the “mean square” approximation of
f(x) by the partial sums of its Fourier series. More precisely, we prove the
following lemma.


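Lemma 6.3 (Bessel’s inequality). If f(x) is integrable on [−π, π], then

\[ \frac{a_0^2}{2} + \sum_{m=1}^{n}\left(a_m^2 + b_m^2\right) \le \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx, \]

where the $a_m$ and $b_m$ are the Fourier coefficients of f(x).

Proof. Consider (of course)

\[ 0 \le \int_{-\pi}^{\pi}\left(f(x) - \left(\frac{a_0}{2} + \sum_{m=1}^{n} a_m\cos mx + b_m\sin mx\right)\right)^2 dx. \]

The integrand is a (complicated) square of a binomial, so we pick it apart.
Note first that if we apply the orthogonality relations from Section 6.1, we
find that (check this)

\[ \int_{-\pi}^{\pi}\left(\sum_{m=1}^{n} a_m\cos mx + b_m\sin mx\right)^2 dx = \pi\sum_{m=1}^{n}\left(a_m^2 + b_m^2\right). \]

Next, check that

\[ a_m\int_{-\pi}^{\pi} f(x)\cos mx\,dx = \pi a_m^2 \quad\text{and}\quad b_m\int_{-\pi}^{\pi} f(x)\sin mx\,dx = \pi b_m^2. \]

Using this and the rule for squaring a sum, we have

\[ 0 \le \int_{-\pi}^{\pi}\left(f(x) - \left(\frac{a_0}{2} + \sum_{m=1}^{n} a_m\cos mx + b_m\sin mx\right)\right)^2 dx \]
\[ = \int_{-\pi}^{\pi} f(x)^2\,dx + \frac{\pi a_0^2}{2} + \pi\sum_{m=1}^{n}\left(a_m^2 + b_m^2\right) - a_0\int_{-\pi}^{\pi} f(x)\,dx \]
\[ \qquad - 2\sum_{m=1}^{n} b_m\int_{-\pi}^{\pi} f(x)\sin mx\,dx - 2\sum_{m=1}^{n} a_m\int_{-\pi}^{\pi} f(x)\cos mx\,dx \]
\[ = \int_{-\pi}^{\pi} f(x)^2\,dx - \frac{\pi a_0^2}{2} - \pi\sum_{m=1}^{n}\left(a_m^2 + b_m^2\right). \]

Hence

\[ \frac{a_0^2}{2} + \sum_{m=1}^{n}\left(a_m^2 + b_m^2\right) \le \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx. \]

We did it! □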
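Bessel's inequality, and the decay of the coefficients that Riemann's lemma asserts, can be observed on a concrete function. The sketch below assumes the hypothetical test function $f(x) = x$ on [−π, π] (odd, so every $a_n$ vanishes) and approximates the integrals with a midpoint rule.

```python
from math import sin, pi

def integrate(h, a=-pi, b=pi, steps=4000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    dx = (b - a) / steps
    return sum(h(a + (j + 0.5) * dx) for j in range(steps)) * dx

f = lambda x: x                                  # hypothetical test function
bound = integrate(lambda x: f(x) ** 2) / pi      # (1/pi) * integral of f^2
b = [integrate(lambda x, n=n: f(x) * sin(n * x)) / pi for n in range(1, 41)]

partial = 0.0
for n, bn in enumerate(b, start=1):
    partial += bn ** 2                           # a_n = 0 here because f is odd
    if n in (1, 5, 10, 20, 40):
        # coefficient, partial sum of squares, and the Bessel bound
        print(n, round(bn, 5), round(partial, 5), round(bound, 5))
```

The partial sums creep up toward the bound without passing it, while the individual $b_n$ shrink toward 0.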


The checks are yours in Exercise 6.5.

For each n, consider the partial sums in Bessel’s inequality summed up
to n. Since the terms on the left-hand side are nonnegative, the partial sums form a
bounded monotonically increasing sequence. Thus the limit exists, which is, of
course, the infinite series

\[ \frac{a_0^2}{2} + \sum_{m=1}^{\infty}\left(a_m^2 + b_m^2\right). \]

Furthermore, the series is bounded by $\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx$. Hence

\[ \frac{a_0^2}{2} + \sum_{m=1}^{\infty}\left(a_m^2 + b_m^2\right) \le \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)^2\,dx. \]

This can be viewed as a weak Pythagorean theorem. The sum of the squares
of the “components” or “sides” of f(x) is at most the square of the
“hypotenuse.” It turns out, although it is harder to prove, that for a large
class of functions (including continuous ones), equality holds. That equality
is called Parseval’s theorem. The abstract development of the allusion
to geometry is called the theory of Hilbert and Banach spaces and is part
of a large branch of contemporary mathematics called functional analysis.
A pleasant essay on the development of these ideas in the first part of the
twentieth century can be found in Hermann Weyl’s article “A Half Century
of Mathematics” [87].

Oh, and by the way, Bessel’s inequality implies Lemma 6.2 (this is Exercise 6.6).


Before we prove the general result on Fourier series, let’s do a calculation
that contains the basic strategy and that delights many people who see it for
the first time.

Small world: This series is ζ(2), where ζ is the Riemann zeta function
defined in Section 3.4.

This proof was pointed out by D. Giesy [30].

This is even true for x = 0 if one defines the right-hand side at 0 to be
its continuous extension at 0, namely $n + \frac{1}{2}$ (use l’Hospital).

$\sin(\alpha + \beta) = \cos\alpha\sin\beta + \sin\alpha\cos\beta$


Theorem 6.4.

\[ \frac{\pi^2}{6} = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots. \]

Proof. We need the following identity, which will be used any number of
times during this chapter, so let’s make it a lemma.


Lemma 6.5. For $x \in [-\pi, \pi]$, we have the equality

\[ \frac{1}{2} + \cos x + \cos 2x + \cdots + \cos nx = \frac{\sin\left(n + \frac{1}{2}\right)x}{2\sin\frac{x}{2}}. \tag{6.6} \]

Proof. To see this, simply multiply the left-hand side by $2\sin\frac{x}{2}$ to obtain

\[ \sin\frac{x}{2} + 2\sin\frac{x}{2}\cos x + 2\sin\frac{x}{2}\cos 2x + \cdots + 2\sin\frac{x}{2}\cos nx. \]

Using $2\sin A\cos B = \sin(A + B) + \sin(A - B)$, we find that this sum nicely
telescopes (always a good sign):

\[ \sin\frac{x}{2} + \left(\sin\frac{3}{2}x - \sin\frac{1}{2}x\right) + \left(\sin\frac{5}{2}x - \sin\frac{3}{2}x\right) + \cdots + \left(\sin\left(n + \frac{1}{2}\right)x - \sin\left(n - \frac{1}{2}\right)x\right), \]

which is simply $\sin\left(n + \frac{1}{2}\right)x$, as desired. □
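Since identity (6.6) is used repeatedly, a quick numerical spot check is worthwhile. A minimal sketch, evaluating both sides at a few arbitrarily chosen nonzero points:

```python
from math import sin, cos

def lhs(n, x):
    """1/2 + cos x + cos 2x + ... + cos nx."""
    return 0.5 + sum(cos(k * x) for k in range(1, n + 1))

def rhs(n, x):
    """sin((n + 1/2) x) / (2 sin(x/2)), for x not a multiple of 2*pi."""
    return sin((n + 0.5) * x) / (2 * sin(x / 2))

for n in (1, 4, 10):
    for x in (0.3, 1.0, -2.5):
        print(n, x, lhs(n, x), rhs(n, x))   # the two columns should match
```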


Back at the ranch, to prove Theorem 6.4, multiply both sides of equation (6.6)
by x and integrate from 0 to π. We obtain

\[ \frac{\pi^2}{4} + \int_0^{\pi} x\cos x\,dx + \int_0^{\pi} x\cos 2x\,dx + \cdots + \int_0^{\pi} x\cos nx\,dx = \int_0^{\pi}\frac{x\sin\left(n + \frac{1}{2}\right)x}{2\sin\frac{x}{2}}\,dx. \tag{6.7} \]

Let’s take up the right-hand side first and show that it approaches 0 as n goes
to infinity. Using the addition formula for sine, we see that the numerator of
the integrand splits, so that

\[ \int_0^{\pi}\frac{x\sin\left(n + \frac{1}{2}\right)x}{2\sin\frac{x}{2}}\,dx = \int_0^{\pi}\frac{x\cos\frac{x}{2}}{2\sin\frac{x}{2}}\sin nx\,dx + \int_0^{\pi}\frac{x}{2}\cos nx\,dx \tag{6.8} \]
\[ = \int_0^{\pi}\frac{\frac{x}{2}}{\sin\frac{x}{2}}\cos\frac{x}{2}\,\sin nx\,dx + \int_0^{\pi}\frac{x}{2}\cos nx\,dx. \]

Both integrals in (6.8) satisfy the conditions of Riemann’s lemma (Lemma 6.2),
since $\frac{x}{2}\big/\sin\frac{x}{2}$ is continuous on [0, π] and goes to 1 as $x \to 0$, so applying
Riemann, we see that

\[ \lim_{n\to\infty}\int_0^{\pi}\frac{x\sin\left(n + \frac{1}{2}\right)x}{2\sin\frac{x}{2}}\,dx = \lim_{n\to\infty}\int_0^{\pi}\frac{x\cos\frac{x}{2}}{2\sin\frac{x}{2}}\sin nx\,dx + \lim_{n\to\infty}\int_0^{\pi}\frac{x}{2}\cos nx\,dx = 0. \]


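On to the left-hand side of equation (6.7). This is a little easier:

\[ \int_0^{\pi} x\cos kx\,dx = \left.\frac{x\sin kx}{k}\right|_0^{\pi} - \frac{1}{k}\int_0^{\pi}\sin kx\,dx, \]

which is 0 if k is even and $-2/k^2$ if k is odd. Therefore, replacing n by 2n + 1,
we obtain

\[ \frac{\pi^2}{4} - 2\left(1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots + \frac{1}{(2n+1)^2}\right) = \int_0^{\pi}\frac{x\sin\left(2n + \frac{3}{2}\right)x}{2\sin\frac{x}{2}}\,dx. \]

Now letting $n \to \infty$, we have, by the same reasoning as above,

\[ \frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots. \]

A little juggling gives the desired result. (The one just proved is already
interesting.) And we also have

\[ \frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{6^2} + \cdots = \frac{1}{4}\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right). \]

Also,

\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots \]

is equal to

\[ \left(1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots\right) + \left(\frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{6^2} + \cdots\right), \]

and therefore to

\[ \frac{\pi^2}{8} + \frac{1}{4}\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right). \]

Equating the two expressions yields

\[ \frac{3}{4}\left(1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots\right) = \frac{\pi^2}{8}, \quad\text{that is,}\quad 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{4}{3}\cdot\frac{\pi^2}{8} = \frac{\pi^2}{6}. \]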
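The two sums in this argument can be compared numerically. A small sketch; the function names are invented for the illustration:

```python
from math import pi

def odd_squares(n):
    """1 + 1/3^2 + 1/5^2 + ... + 1/(2n+1)^2."""
    return sum(1 / (2 * k + 1) ** 2 for k in range(n + 1))

def all_squares(n):
    """1 + 1/2^2 + 1/3^2 + ... + 1/n^2."""
    return sum(1 / k ** 2 for k in range(1, n + 1))

n = 100000
print(odd_squares(n), pi ** 2 / 8)           # odd-square sum vs pi^2/8
print(all_squares(n), pi ** 2 / 6)           # full sum vs pi^2/6
print(4 / 3 * odd_squares(n))                # the "juggling": (4/3) * odd sum
```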
And it reinforces the belief that questions about integers are questions
about real things.

Think for a minute. Why is this assumption reasonable?

Things are heating up.

Lookout Point 6.2. The ubiquitous $\pi^2/6$. What is the probability that
two integers chosen at random are relatively prime? In spite of the fact that 
the notion of “a randomly chosen integer” is a strange one, there is at least 
an intuitive idea of what the question means. Indeed, if we set up an exper- 
iment that picks two random integers and calculates their greatest common 
divisor and we repeat this thousands of times, then the ratio of the number of 
relatively prime pairs to the number of trials is a good approximation to the 
answer to our question. Of course, we are only picking numbers between 1 
and 100,000 (or whatever), and that is precisely the difference between what 
we can do experimentally and what it might mean to choose an integer at 
random from all of Z. 

To get a feel for the situation, you can set up an experiment in your favorite 
computational environment, run it many times, and see whether the results 
seem to cluster around a particular value. Try it—it’s fun. And if you run the 
experiment many times, you will see that it almost always outputs a number 
close to 0.6. What is that trying to tell us? Put the computer away, and let’s 
settle on a meaning for our question. 
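One possible version of such an experiment is sketched here. The bound of 100,000 and the number of trials are arbitrary choices, and the fixed seed is only there to make the run repeatable.

```python
import random
from math import gcd

def coprime_ratio(trials, bound, seed=0):
    """Fraction of random pairs from 1..bound that are relatively prime."""
    rng = random.Random(seed)
    hits = sum(gcd(rng.randint(1, bound), rng.randint(1, bound)) == 1
               for _ in range(trials))
    return hits / trials

for seed in range(5):
    # each run should print a value in the neighborhood of 0.6
    print(seed, coprime_ratio(100000, 100000, seed=seed))
```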

Let us assume that if p is a prime, the probability that an integer chosen at
random is divisible by p is $\frac{1}{p}$. Thus the probability that an integer is even is
$\frac{1}{2}$, the probability that an integer is divisible by 3 is $\frac{1}{3}$, and so on.

It follows from this assumption that the probability that two integers chosen
at random are both divisible by p is $\frac{1}{p^2}$. So the probability that two integers
chosen at random are not both divisible by p is $1 - 1/p^2$.

Under these assumptions, we see that the probability that two randomly
chosen integers are not both divisible by 2, 3, 5, 7, 11, or 13 is

\[ \left(1 - \frac{1}{2^2}\right)\left(1 - \frac{1}{3^2}\right)\left(1 - \frac{1}{5^2}\right)\left(1 - \frac{1}{7^2}\right)\left(1 - \frac{1}{11^2}\right)\left(1 - \frac{1}{13^2}\right) \approx 0.618. \]

Perhaps you can now see what is happening here: we have seen a product
like this before. With a (small) leap of faith, we can pass to the limit: a and
b are relatively prime if they are not both divisible by any prime. So, the
probability that two integers a and b are relatively prime is

\[ \prod_{p}\left(1 - \frac{1}{p^2}\right), \]

where the product is over all primes p.
Oh, my! Remember equation (3.6) in Section 3.4? Just in case, here it is:

\[ 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots = \prod_{p}\frac{1}{1 - \frac{1}{p^s}}. \]

It follows that our favorite product can be expressed in terms of the Riemann
zeta function:

\[ \prod_{p}\left(1 - \frac{1}{p^2}\right) = \frac{1}{\zeta(2)}. \]

Combine this with Theorem 6.4:

\[ \frac{\pi^2}{6} = \zeta(2) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots, \]

and we have this remarkable result:


Theorem 6.6. The probability that two integers chosen at random are relatively
prime is

\[ \frac{6}{\pi^2}. \]

Note that $6/\pi^2 = 0.60792710\ldots \approx 0.6$, the experimental result proposed
above. Most people who hear this result (without proof) wonder what in the
world π has to do with it. Now you know.

See https://mathworld.wolfram.com/RelativelyPrime.html for more detail about these
ideas.

Exercises 
6.1 Consider the following recursively defined sequence of polynomials
{T(k)} in $\mathbb{Z}[x]$:

\[ T(k) = \begin{cases} 1 & \text{if } k = 0,\\ x & \text{if } k = 1,\\ 2x\,T(k-1) - T(k-2) & \text{if } k > 1. \end{cases} \]

For example,

\[ T(0) = 1, \quad T(1) = x, \quad T(2) = 2x^2 - 1, \quad T(3) = 4x^3 - 3x. \]

(i) Calculate a few more (six, say) of the terms T(k) in the sequence
and find some patterns in it. Prove your conjectures.
(ii) The T(k) are formal polynomials in x, so you can substitute values
for x and get identities about numbers. Show that for every real value
of θ,

\[ T(n)(\cos\theta) = \cos n\theta. \]

This generalizes the high-school “double-angle formula” for cos 2θ.

(iii) Take It Further. How about a closed form for T?


6.2 Show that

\[ \int_{-\pi}^{\pi}\cos^3 x\,\cos 3x\,dx = \frac{\pi}{4}. \]

6.3 Prove the remaining orthogonality relations from this section:

(ii) $\displaystyle\int_{-\pi}^{\pi}\sin nx\,\cos mx\,dx = 0$ for all n, m,

(iii) $\displaystyle\int_{-\pi}^{\pi}\sin nx\,\sin mx\,dx = \begin{cases}\pi & \text{if } n = m \neq 0,\\ 0 & \text{if } n \neq m.\end{cases}$


6.4 Interpret the expression

\[ \int_{-\pi}^{\pi}\left(f(x) - \left(\frac{a_0}{2} + \sum_{m=1}^{n} a_m\cos mx + b_m\sin mx\right)\right)^2 dx \]

as an “average (mean) square distance.”

6.5 Using the notation of this section, show that

(i)

\[ \int_{-\pi}^{\pi}\left(\sum_{m=1}^{n} a_m\cos mx + b_m\sin mx\right)^2 dx = \pi\sum_{m=1}^{n}\left(a_m^2 + b_m^2\right). \]

(ii)

\[ a_m\int_{-\pi}^{\pi} f(x)\cos mx\,dx = \pi a_m^2 \quad\text{and}\quad b_m\int_{-\pi}^{\pi} f(x)\sin mx\,dx = \pi b_m^2. \]


6.6 Show that Bessel’s inequality (Lemma 6.3) implies Riemann’s lemma
(Lemma 6.2).


6.7 What is the probability that an integer picked at random has no perfect 
square factor? 


6.2 Dirichlet’s Theorem 
Let us state at once what we want to prove. 


Theorem 6.7 (Dirichlet, 1829). Let f(x) be continuous on [−π, π] and
differentiable at a point $x_0 \in (-\pi, \pi)$. Then

\[ f(x_0) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos nx_0 + b_n\sin nx_0\right), \]

where

\[ a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx, \quad n = 0, 1, 2, \dots, \]
\[ b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \dots. \]

Notice that we have assumed f(x) to be differentiable only at $x = x_0$. It
can be quite nasty away from $x_0$. In fact, the proof we give will show that
whether the series converges depends only on the behavior of f near $x_0$.

This is not a superficial statement, since the coefficients $a_n$ and $b_n$ are
determined by the values of the functions on the whole interval.

Proof. To prove the result, form the number

\[ S_n = \frac{a_0}{2} + \sum_{k=1}^{n}\left(a_k\cos kx_0 + b_k\sin kx_0\right). \]

We must show that

\[ S_n \to f(x_0) \quad\text{as } n \to \infty. \]
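It helps to watch $S_n$ approach $f(x_0)$ in one concrete case. The sketch below assumes the hypothetical example $f(x) = x$ on [−π, π] and $x_0 = 1$; for this f, integration by parts gives $a_k = 0$ and $b_k = 2(-1)^{k+1}/k$, and those closed forms are used directly rather than numerical integrals.

```python
from math import sin

def partial_sum(x0, n):
    """S_n(x0) for f(x) = x on [-pi, pi]: a_k = 0 and b_k = 2(-1)^(k+1)/k."""
    return sum(2 * (-1) ** (k + 1) * sin(k * x0) / k for k in range(1, n + 1))

x0 = 1.0                                   # hypothetical point in (-pi, pi)
for n in (10, 100, 1000, 10000):
    s = partial_sum(x0, n)
    print(n, s, s - x0)                    # the error shrinks, but only slowly
```

The convergence is visibly slow, which is a hint that the proof will need a genuinely delicate estimate rather than a crude bound on the tail.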


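Our first (and major) task is to express $S_n$ as a definite integral. Replace
$a_0, a_1, \dots, a_n, b_1, b_2, \dots, b_n$ in $S_n$ by their values as definite integrals to obtain

\[ S_n = \frac{1}{2}\cdot\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,dx + \sum_{k=1}^{n}\frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\left(\cos kx\cos kx_0 + \sin kx\sin kx_0\right)dx \]
\[ = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\left\{\frac{1}{2} + \sum_{k=1}^{n}\cos kx\cos kx_0 + \sin kx\sin kx_0\right\}dx \]
\[ = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\left\{\frac{1}{2} + \sum_{k=1}^{n}\cos k(x - x_0)\right\}dx \]
\[ = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\,\frac{\sin\left(n + \frac{1}{2}\right)(x - x_0)}{2\sin\frac{x - x_0}{2}}\,dx. \]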

This expresses the finite sum $S_n$ of the series we are investigating as a definite
integral involving f(x). Next we want to change the variable by replacing
$x - x_0$ with t. However, f is defined only on [−π, π]. Extend f(x) periodically
to R by defining $f(x + 2\pi) = f(x)$, so that f now is periodic of period 2π.
On effecting the change of variable $x - x_0 = t$, we obtain

\[ S_n = \frac{1}{\pi}\int_{-\pi - x_0}^{\pi - x_0} f(x_0 + t)\,\frac{\sin\left(n + \frac{1}{2}\right)t}{2\sin\frac{t}{2}}\,dt, \]

which by periodicity is just the integral from −π to π. For the record:

\[ S_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x_0 + t)\,\frac{\sin\left(n + \frac{1}{2}\right)t}{2\sin\frac{t}{2}}\,dt. \tag{6.9} \]

This last equality is thanks to Lemma 6.5.

Check that the entire integrand is now periodic of period 2π.

We need to show that $S_n - f(x_0)$ has limit zero. The idea is to absorb
$f(x_0)$ under the integral sign and estimate the difference using Riemann’s
lemma and the hypothesis of differentiability. The absorption is achieved by
going back to our favorite identity:

\[ \frac{1}{2} + \cos x + \cos 2x + \cdots + \cos nx = \frac{\sin\left(n + \frac{1}{2}\right)x}{2\sin\frac{x}{2}}. \]

Integrating from −π to π gives

\[ \pi + 0 + 0 + \cdots + 0 = \int_{-\pi}^{\pi}\frac{\sin\left(n + \frac{1}{2}\right)t}{2\sin\frac{t}{2}}\,dt. \]

Multiplying by $f(x_0)$ (a constant!) and dividing by π shows that

\[ f(x_0) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x_0)\,\frac{\sin\left(n + \frac{1}{2}\right)t}{2\sin\frac{t}{2}}\,dt. \]

Subtracting $f(x_0)$ from our expression for $S_n$ gives the lovely relation

\[ S_n - f(x_0) = \frac{1}{\pi}\int_{-\pi}^{\pi}\left(f(x_0 + t) - f(x_0)\right)\frac{\sin\left(n + \frac{1}{2}\right)t}{2\sin\frac{t}{2}}\,dt. \tag{6.10} \]

Next, let $g(t) = f(x_0 + t) - f(x_0)$.
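We are ready to operate. Consider

\[ \frac{1}{t} - \frac{1}{2\sin\frac{t}{2}} = \frac{2\sin\frac{t}{2} - t}{2t\sin\frac{t}{2}}. \tag{6.11} \]

We apply l’Hospital a few times to see the behavior of the right-hand side
at 0:

\[ \lim_{t\to 0}\frac{2\sin\frac{t}{2} - t}{2t\sin\frac{t}{2}} = \lim_{t\to 0}\frac{\cos\frac{t}{2} - 1}{2\sin\frac{t}{2} + t\cos\frac{t}{2}} = \lim_{t\to 0}\frac{-\frac{1}{2}\sin\frac{t}{2}}{2\cos\frac{t}{2} - \frac{t}{2}\sin\frac{t}{2}} = 0. \]

So, thanks to l’Hospital, we see that

\[ \frac{1}{t} - \frac{1}{2\sin\frac{t}{2}} \]

is continuous at 0.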
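The limit just computed is easy to confirm numerically. This sketch also compares the difference with $-t/24$, the leading term one gets from the Taylor expansion $2\sin\frac{t}{2} = t - \frac{t^3}{24} + \cdots$; that comparison is an addition of this illustration, not something used in the proof.

```python
from math import sin

def gap(t):
    """The difference 1/t - 1/(2 sin(t/2)) for t != 0."""
    return 1 / t - 1 / (2 * sin(t / 2))

for t in (1.0, 0.1, 0.01, 0.001):
    print(t, gap(t), -t / 24)      # gap(t) tends to 0, tracking -t/24
```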
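Thus, with some fancy footwork, we can write

\[ S_n - f(x_0) = \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\,\frac{\sin\left(n + \frac{1}{2}\right)t}{t}\,dt - \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\left(\frac{1}{t} - \frac{1}{2\sin\frac{t}{2}}\right)\sin\left(n + \frac{1}{2}\right)t\,dt. \tag{6.12} \]

But

\[ \frac{1}{\pi}\int_{-\pi}^{\pi} g(t)\left(\frac{1}{t} - \frac{1}{2\sin\frac{t}{2}}\right)\sin\left(n + \frac{1}{2}\right)t\,dt = \frac{1}{\pi}\int_{-\pi}^{\pi} H(t)\sin\left(n + \frac{1}{2}\right)t\,dt, \]

where H is continuous. So again by Riemann (Lemma 6.2), we have

\[ \lim_{n\to\infty}\frac{1}{\pi}\int_{-\pi}^{\pi} H(t)\sin\left(n + \frac{1}{2}\right)t\,dt = 0. \]

Next, passing to the limit, we have

\[ \lim_{n\to\infty}\left(S_n - f(x_0)\right) = \lim_{n\to\infty}\frac{1}{\pi}\int_{-\pi}^{\pi}\frac{g(t)}{t}\sin\left(n + \frac{1}{2}\right)t\,dt = \lim_{n\to\infty}\frac{1}{\pi}\int_{-\pi}^{\pi}\frac{f(x_0 + t) - f(x_0)}{t}\sin\left(n + \frac{1}{2}\right)t\,dt. \]

Here is where differentiability enters the action, because the differential quotient

\[ \frac{f(x_0 + t) - f(x_0)}{t} \]

is continuous at $t = 0$ (once it is given the value $f'(x_0)$ there). And once again,
Riemann comes to the rescue, so that (at last)

\[ \lim_{n\to\infty}\left(S_n - f(x_0)\right) = 0. \]

We did it! □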




Although l’Hospital is convenient, it is useful (and fun) to derive
Theorem 6.7 directly from the mean value theorem. Here are the details.

Proof. Applying the mean value theorem to $2\sin\frac t2$ on $[0,t]$ gives

$$2\sin\frac t2 - 0 = t\cos\xi_t, \quad\text{where } 0 < \xi_t < t.$$

Hence

$$\frac1t - \frac{1}{2\sin\frac t2} = \frac1t\Big(1 - \frac{1}{\cos\xi_t}\Big) = \frac{\cos\xi_t - 1}{t\cos\xi_t}.$$

Now, $(\cos\xi_t - 1)/t$ approaches 0 as $t \to 0$ (and hence $\xi_t \to 0$),
as follows from the mean value theorem again applied to $\cos x$, since
$\xi_t/t < 1$ and $\cos\xi_t$ approaches 1 as $\xi_t \to 0$.

A double application of the mean value theorem replaces a double
application of l’Hospital. Returning to $S_n - f(x_0)$, we write

$$
\begin{aligned}
S_n - f(x_0) = {} & \frac1\pi\int_{-\pi}^{\pi}\frac{f(x_0+t) - f(x_0)}{t}\cdot\sin\big(n+\tfrac12\big)t\,dt \\
& - \frac1\pi\int_{-\pi}^{\pi}\big(f(x_0+t) - f(x_0)\big)\Big(\frac1t - \frac{1}{2\sin\frac t2}\Big)\cdot\sin\big(n+\tfrac12\big)t\,dt.
\end{aligned}
\tag{6.13}
$$

The second integral approaches zero as $n \to \infty$ by Riemann’s lemma. As
for the first integral, we will see that its behavior is concentrated at 0. More
precisely, let $\delta > 0$ be very small. Write the first integral of (6.13) as the sum
of three integrals by splitting the interval of integration at $t = -\delta$ and $t = \delta$.
Away from 0, the denominator $t$ is harmless, and Riemann’s lemma tells
us that two of the three integrals go to 0 as $n \to \infty$. In other words, whether
the sequence $S_n$ converges to $f(x_0)$ depends solely on the remaining integral

$$\frac1\pi\int_{-\delta}^{\delta}\frac{f(x_0+t) - f(x_0)}{t}\cdot\sin\big(n+\tfrac12\big)t\,dt.$$

If we assume that $f(x)$ is differentiable at $x_0$, then by definition,

$$\frac{f(x_0+t) - f(x_0)}{t}$$

is continuous at 0, and a final application of Riemann’s lemma finishes the
proof of Dirichlet’s theorem. ∎
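To see Dirichlet's theorem at work, one can compute a partial sum $S_n(x_0)$ numerically and compare it with $f(x_0)$. The sketch below is only an illustration under our own choices: the test function $f(x) = x^2$ on $[-\pi, \pi]$, the point $x_0 = 1$, and a midpoint rule for the coefficient integrals are all assumptions, not part of the text.

```python
import math

def fourier_partial_sum(f, x0, n, steps=4000):
    """Approximate S_n(x0) = a_0/2 + sum_{k<=n} (a_k cos k x0 + b_k sin k x0).

    The coefficients a_k, b_k are (1/pi) times integrals over [-pi, pi],
    approximated here by the midpoint rule with `steps` subintervals.
    """
    h = 2.0 * math.pi / steps
    xs = [-math.pi + (i + 0.5) * h for i in range(steps)]
    fx = [f(x) for x in xs]
    total = sum(fx) * h / (2.0 * math.pi)  # this is a_0 / 2
    for k in range(1, n + 1):
        a_k = sum(v * math.cos(k * x) for v, x in zip(fx, xs)) * h / math.pi
        b_k = sum(v * math.sin(k * x) for v, x in zip(fx, xs)) * h / math.pi
        total += a_k * math.cos(k * x0) + b_k * math.sin(k * x0)
    return total

if __name__ == "__main__":
    square = lambda x: x * x
    for n in (5, 20, 50):
        print(n, fourier_partial_sum(square, 1.0, n), square(1.0))
```

The printed partial sums should drift toward $f(1) = 1$ as $n$ grows.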


Lookout Point(s) 6.3. It turns out that without some restriction on the
behavior of $f(x)$ at $x_0$ (i.e., of $f(x_0 + t)$ at $t = 0$), one cannot prove that
$S_n \to f(x_0)$. In 1910, Lipót Fejér (1880-1959) produced an example of a
continuous function for which the sequence $S_n(x_0)$ is unbounded. The
assumption of differentiability at $x_0$ is stronger than needed, but it illustrates the
method and suffices for many applications.

Dirichlet’s paper in which he establishes this result leaves nothing to be
desired in terms of modern-day rigor. An examination of the proof shows that
it still goes through if we assume $f(x)$ to have only left and right derivatives
at $x_0$. It is only necessary to break the integral $\int_{-\pi}^{\pi} f(x)\,dx$ into
$\int_{-\pi}^{0} f(x)\,dx + \int_{0}^{\pi} f(x)\,dx$ and use Riemann’s lemma on each integral, observing that the
existence of a left and right derivative gives continuity at the critical endpoint.
Thus the situation illustrated in Figure 6.1 is allowed to happen.

After all, how does one prove l’Hospital!?

It’s a good exercise to carry out the translation.

Figure 6.1. There can be cusps in the graph of y = f(x).

Furthermore, one can handle the case of a simple jump discontinuity as in
Figure 6.2. For this we assume that left- and right-hand derivatives

$$\lim_{t\to0^+}\frac{f(x_0+t) - f(x_0^+)}{t}\quad\text{and}\quad\lim_{t\to0^-}\frac{f(x_0+t) - f(x_0^-)}{t}$$

exist. Then you can just push the pieces together and use the proof above. The
result is that $S_n(x_0)$ approaches the average

$$\frac{f(x_0^+) + f(x_0^-)}{2}.$$

Figure 6.2. There can be jumps.


Finally, we mention that due to the fact that $f$ has been extended to have
period $2\pi$, the limits $-\pi$ and $\pi$ can be replaced by $a$ and $a + 2\pi$. In other words,
it is just the length of the interval that matters. Often you will see integrals
from $-\pi$ to $\pi$ replaced with the same integrand with limits from 0 to $2\pi$. Thus
we can state a corollary.

Corollary 6.8. If $f(x)$ is defined on $[0, 2\pi]$ and differentiable at $x_0 \in (0, 2\pi)$,
then

$$f(x_0) = \frac{a_0}{2} + \sum_{k=1}^{\infty}\big(a_k\cos kx_0 + b_k\sin kx_0\big),$$

where

$$a_n = \frac1\pi\int_{0}^{2\pi} f(x)\cos nx\,dx, \quad n = 0, 1, 2, \ldots,$$

$$b_n = \frac1\pi\int_{0}^{2\pi} f(x)\sin nx\,dx, \quad n = 1, 2, \ldots.$$

Figure 6.3. Endpoint discontinuities at odd multiples of $\pi$.


Another thing: if $f$ is continuous on $[-\pi, \pi]$, its periodic extension need
not be continuous at an endpoint of the interval (see Figure 6.3). However, the
above observations show that if we shift the interval of integration so that the
jump discontinuity falls inside, then we may conclude that the Fourier series
at $x = \pi$ converges to $\frac{f(\pi) + f(-\pi)}{2}$ when the graph of $f$ has left and right derivatives at $\pi$.
We shall use this observation later.

And one more thing: In our evaluation of Gauss sums we need to work on
$[0,1]$, which doesn’t have length $2\pi$. But if $f(x)$ is defined on $[0,1]$, then
$f\big(\frac{x}{2\pi}\big)$ is defined on $[0, 2\pi]$. Applying the theorem on $[0, 2\pi]$ and simplifying
gives the following normalized result.


Theorem 6.9 (Theorem 6.7, version 2). If $f(x)$ is continuous on $[0,1]$ and
differentiable there, then for all $x$ in $(0,1)$, one has

$$f(x) = \frac{\alpha_0}{2} + \sum_{k=1}^{\infty}\big(\alpha_k\cos 2\pi kx + \beta_k\sin 2\pi kx\big),$$

where

$$\alpha_k = 2\int_0^1 f(t)\cos 2\pi kt\,dt, \quad k = 0, 1, \ldots,$$

$$\beta_k = 2\int_0^1 f(t)\sin 2\pi kt\,dt, \quad k = 1, 2, \ldots.$$

Furthermore, when $f(x)$ has a right derivative at 0 and a left derivative at 1,
one has

$$\frac{\alpha_0}{2} + \sum_{k=1}^{\infty}\alpha_k = \frac{f(0) + f(1)}{2}.$$

Doing the first thing will show up in the exercises.

And how would you define an even function? How about an example of each
kind? See Exercises 6.8 and 6.9 for more about even and odd functions.

This is Exercise 6.10.


6.3 Applications to Numerical Series 


One thing to do with Dirichlet’s theorem is to calculate the Fourier series for 
some familiar functions. Another is to find a function with a given series. In 
this section, we will do the second thing and end up with some very pretty 
identities. 


Example 1. One of the earliest series investigated was

$$\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots.$$

For this, consider (of course) the function $f$ defined by $f(x) = \frac{\pi - x}{2}$ on $[0, \pi]$.
In order to apply Dirichlet’s theorem (Theorem 6.7), we need to extend its
domain to $[-\pi, \pi]$, so let’s consider (again, of course) the function $f$ that is
$\frac{\pi - x}{2}$ on $[0, \pi]$ and $-\frac{\pi + x}{2}$ on $[-\pi, 0)$. Its graph is pictured in Figure 6.4.

Figure 6.4. $f(x) = \frac{\pi - x}{2}$ on $[0, \pi]$ and $-\frac{\pi + x}{2}$ on $[-\pi, 0)$.


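We have, by construction,

$$f(x) = -f(-x) \quad\text{for } x \in (0, \pi].$$

Such a function is called odd. All the “even” Fourier coefficients for an odd
function are zero, since

$$
\begin{aligned}
a_n = \frac1\pi\int_{-\pi}^{\pi} f(x)\cos nx\,dx &= \frac1\pi\int_{\pi}^{-\pi} f(-x)\cos n(-x)\,d(-x) \\
&= \frac1\pi\int_{\pi}^{-\pi} f(x)\cos nx\,dx = -\frac1\pi\int_{-\pi}^{\pi} f(x)\cos nx\,dx = -a_n.
\end{aligned}
$$

Thus $2a_n = 0$, which implies $a_n = 0$. Furthermore, we can simplify calculation
of the “odd” coefficients by showing in the same way that

$$b_n = \frac2\pi\int_0^{\pi} f(x)\sin nx\,dx.$$

On $[0, \pi]$, we have $f(x) = \frac{\pi - x}{2}$. Therefore,

$$
\begin{aligned}
b_n &= \frac2\pi\int_0^{\pi}\Big(\frac{\pi - x}{2}\Big)\sin nx\,dx \\
&= \frac2\pi\int_0^{\pi}\frac\pi2\sin nx\,dx - \frac1\pi\int_0^{\pi} x\sin nx\,dx \\
&= \int_0^{\pi}\sin nx\,dx - \frac1\pi\left(\left.\frac{-x\cos nx}{n}\right|_0^{\pi} + \frac1n\int_0^{\pi}\cos nx\,dx\right) \\
&= \left.-\frac1n\cos nx\right|_0^{\pi} + \frac{(-1)^n}{n} + 0 \\
&= -\frac{(-1)^n}{n} + \frac1n + \frac{(-1)^n}{n} + 0 = \frac1n.
\end{aligned}
$$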
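The value $b_n = 1/n$ is easy to confirm by numerical integration. The following sketch uses a composite Simpson rule; the helper names `simpson` and `b_coefficient` are ours, and Simpson's rule is simply a convenient stand-in for the exact integration by parts.

```python
import math

def simpson(g, a, b, m=2000):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def b_coefficient(n):
    """(2/pi) * integral over [0, pi] of ((pi - x)/2) * sin(n x) dx."""
    integrand = lambda x: 0.5 * (math.pi - x) * math.sin(n * x)
    return (2.0 / math.pi) * simpson(integrand, 0.0, math.pi)

if __name__ == "__main__":
    for n in range(1, 7):
        print(n, b_coefficient(n), 1.0 / n)
```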


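Applying Dirichlet’s theorem gives us the following result.

Theorem 6.10. For $x \in (0, \pi)$,

$$\frac{\pi - x}{2} = \sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots. \tag{6.14}$$

Putting $x = \pi/2$ gives

$$\frac\pi4 = 1 - \frac13 + \frac15 - \frac17 + \cdots, \tag{6.15}$$

which is our friend the Leibniz–Gregory series.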
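A short experiment shows how slowly, but surely, the partial sums of $\sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots$ settle down to $\frac{\pi - x}{2}$, and how $x = \pi/2$ produces the alternating series for $\pi/4$. This is a sketch with names of our own choosing.

```python
import math

def sine_series_partial(x, n):
    """n-th partial sum of sin x + sin 2x / 2 + ... + sin nx / n."""
    return sum(math.sin(k * x) / k for k in range(1, n + 1))

if __name__ == "__main__":
    x = 1.0
    for n in (10, 100, 2000):
        print(n, sine_series_partial(x, n), (math.pi - x) / 2.0)
    # At x = pi/2 the even-indexed terms vanish and the odd ones alternate in sign.
    print(4.0 * sine_series_partial(math.pi / 2.0, 2000), math.pi)
```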


Example 2. As a second application, consider the function $f$ defined by

$$f(x) = \begin{cases}\ \ \dfrac\pi4 & \text{if } x \in (0, \pi),\\[1.5ex] -\dfrac\pi4 & \text{if } x \in (-\pi, 0),\end{cases}$$

and then further defined indefinitely in both directions so that it has period $2\pi$.
Since the Fourier series insists on the value 0 at the origin, we might as well
put $f(0) = 0$. Hence the graph of $y = f(x)$ is as in Figure 6.5.

Figure 6.5. An odd function.


See Exercise 6.11.

This expression for


might require a “little think” in order to convince yourself that it is true.

This is Exercise 6.12.

Since $f$ is odd, the “even” coefficients $a_n$ are all zero, and

$$
\begin{aligned}
b_n = \frac2\pi\int_0^{\pi} f(x)\sin nx\,dx &= \frac2\pi\int_0^{\pi}\frac\pi4\sin nx\,dx = \frac12\int_0^{\pi}\sin nx\,dx \\
&= \frac12\left(\left.-\frac{\cos nx}{n}\right|_0^{\pi}\right) = \frac12\Big(\frac1n - \frac{(-1)^n}{n}\Big),
\end{aligned}
$$

which is equal to $\frac1n$ if $n$ is odd, and 0 if $n$ is even.


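Hence if $x \in (0, \pi)$, then

$$\frac\pi4 = \sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots.$$

Putting $x = \frac\pi3$ gives

$$\frac{\pi}{2\sqrt3} = 1 - \frac15 + \frac17 - \frac1{11} + \frac1{13} - \frac1{17} + \frac1{19} - \frac1{23} + \cdots.$$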
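Here is a small numerical check of the series $\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \cdots = \frac\pi4$ on $(0, \pi)$. At $x = \pi/3$ the sines of odd multiples take only the values $\pm\frac{\sqrt3}{2}$ and 0, so the series collapses to $\frac{\sqrt3}{2}\big(1 - \frac15 + \frac17 - \frac1{11} + \cdots\big)$, a sum over the integers coprime to 6; the second function below tests that numerical series against $\frac{\pi}{2\sqrt3}$. The function names are ours, and this is a sketch rather than a proof.

```python
import math

def odd_sine_partial(x, terms):
    """Partial sum of sin x + sin 3x / 3 + sin 5x / 5 + ... with `terms` terms."""
    return sum(math.sin((2 * j + 1) * x) / (2 * j + 1) for j in range(terms))

def coprime_to_six_partial(pairs):
    """Partial sum of 1 - 1/5 + 1/7 - 1/11 + ..., grouped as 1/(6j+1) - 1/(6j+5)."""
    return sum(1.0 / (6 * j + 1) - 1.0 / (6 * j + 5) for j in range(pairs))

if __name__ == "__main__":
    print(odd_sine_partial(1.0, 2000), math.pi / 4.0)
    print(coprime_to_six_partial(100000), math.pi / (2.0 * math.sqrt(3.0)))
```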


Take It Further
Here is another pretty formula. For $x \in [0, \pi]$,

$$\frac{\pi^2}{8} - \frac{\pi x}{4} = \cos x + \frac{\cos 3x}{3^2} + \frac{\cos 5x}{5^2} + \cdots.$$

For the proof, just calculate the Fourier series of the left-hand side. You
might ask how one invents the left-hand side. Well, it is linear in $x$. But $x$
is an odd function, and the right-hand side is even. So any Fourier series
experimenter who tried $|x|$, which is $x$ made even, will arrive at the desired
series. You should make $x^2$ odd and see what happens. Incidentally, putting
$x = 0$ gives once again

$$\frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots.$$


Example 3. As a third application of Fourier series we will see how the
Fourier series for certain functions that depend on a parameter give
non-trigonometric expansions for other functions. The example we use here is
$\cos ax$, where $a$ is a fixed real number, but not an integer. If you calculate the
Fourier series on $[-\pi, \pi]$, you get

$$\cos ax = \frac{\sin a\pi}{\pi}\Big(\frac1a - \frac{2a}{a^2 - 1}\cos x + \frac{2a}{a^2 - 4}\cos 2x - \cdots\Big).$$

Putting $x = 0$ and viewing $a$ as a variable, we have

$$\frac{\pi}{\sin a\pi} = \frac1a - \frac{2a}{a^2 - 1} + \frac{2a}{a^2 - 4} - \frac{2a}{a^2 - 9} + \cdots.$$

In other words, on replacing $a$ by $x$, we have the result that if $x \notin \mathbb{Z}$, then

$$\frac{\pi\csc(\pi x)}{2x} = \frac{1}{2x^2} - \frac{1}{x^2 - 1} + \frac{1}{x^2 - 4} - \frac{1}{x^2 - 9} + \cdots,$$

which yields an infinite partial fraction expression for the cosecant
function $\csc(\pi x)$.
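The partial fraction expansion of the cosecant, $\frac{\pi\csc(\pi x)}{2x} = \frac{1}{2x^2} - \frac{1}{x^2-1} + \frac{1}{x^2-4} - \cdots$, can be tested numerically at any non-integer $x$. The sketch below does this at two sample points of our own choosing; the function names are hypothetical.

```python
import math

def cosecant_partial(x, terms):
    """1/(2x^2) - 1/(x^2 - 1) + 1/(x^2 - 4) - ... using `terms` fractions after the first."""
    total = 1.0 / (2.0 * x * x)
    for k in range(1, terms + 1):
        total += (-1) ** k / (x * x - k * k)
    return total

def cosecant_closed(x):
    """pi * csc(pi x) / (2x) for non-integer x."""
    return math.pi / (2.0 * x * math.sin(math.pi * x))

if __name__ == "__main__":
    for x in (0.5, 2.3):
        print(x, cosecant_partial(x, 2000), cosecant_closed(x))
```

At $x = \frac12$ the closed form is exactly $\pi$, so the series gives yet another way to approximate $\pi$.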


Exercises

6.8 Show that every real-valued function is the sum of an even function and
an odd function.

6.9 What polynomial functions are even functions? Odd functions? Prove
what you state.

6.10 Show that for an odd function, the odd Fourier coefficients $b_n$ are given
by

$$b_n = \frac2\pi\int_0^{\pi} f(x)\sin nx\,dx.$$

6.11 Show that $\sin\frac{n\pi}{2}$ is a Dirichlet character modulo 4. In other words,

$$\sin\frac{n\pi}{2} = \begin{cases}\ \ 0 & \text{if } n \equiv 0 \pmod 4,\\ \ \ 0 & \text{if } n \equiv 2 \pmod 4,\\ \ \ 1 & \text{if } n \equiv 1 \pmod 4,\\ -1 & \text{if } n \equiv 3 \pmod 4.\end{cases}$$

6.12 Calculate the Fourier series for $f(x) = |x|$.

6.13 Let $f$ and $g$ be continuous on [-1, 1] and differentiable on (0,1). Show
that the Fourier coefficients of a linear combination of $f$ and $g$, say
$cf + dg$, where $c$ and $d$ are real numbers, are the corresponding linear
combinations of the Fourier coefficients of $f$ and $g$.

6.14 Take It Further. Calculate the Fourier coefficients on $[-\pi, \pi]$ for
$f_n(x) = |x|^n$, $n = 1, 2, 3, 4$. State and prove any regularity (in terms of $n$)
that you find in the formulas.


6.4 Gauss Sums


In Chapter 2, we met a certain sum of roots of unity called a Gauss sum. A
special case came up in the construction of a regular pentagon. Recall what a
Gauss sum is: if $\zeta = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n} = e^{2\pi i/n}$, then

$$G_n = 1 + \zeta + \zeta^4 + \zeta^9 + \cdots + \zeta^{(n-1)^2}.$$

See equation (2.10) in Section 2.2.

Note that $G_1 = 1$. We will use this in a bit.

Gauss succeeded in evaluating $G_n$. The answer is rather amazing. One has

$$G_n = \begin{cases}(1+i)\sqrt n & \text{if } n \equiv 0\ (4),\\ \sqrt n & \text{if } n \equiv 1\ (4),\\ i\sqrt n & \text{if } n \equiv 3\ (4),\\ 0 & \text{if } n \equiv 2\ (4).\end{cases}$$

This remarkable result has been proved many times over. The proof we
give here is due to Dirichlet and amounts to an ingenious application of a
Fourier series. More precisely, the evaluation of $G_n$ will be achieved by
evaluating $\int_0^{\infty}\cos(x^2)\,dx$ and $\int_0^{\infty}\sin(x^2)\,dx$ in two different ways, coupled with
the Fourier series for $\cos(x^2)$ and $\sin(x^2)$. Here we go...


6.4.1 A Brief Review of Infinite Integrals


We need some facts about real-valued functions $f$ defined on $[a, \infty)$. Suppose
$\int_a^N f(x)\,dx$ exists for each $N > a$. If $\lim_{N\to\infty}\int_a^N f(x)\,dx$ exists, then we
denote it by $\int_a^{\infty} f(x)\,dx$. The only lemma we need concerns integrals of
the type $\int_a^{\infty} f(x)\cos x\,dx$ and $\int_a^{\infty} f(x)\sin x\,dx$, where $f(x)$ is a positive
monotonically decreasing function with limit zero as $x \to \infty$. We prove the
lemma for the sine integral and leave to you the other case.

It’s left to you as Exercise 6.15.


Lemma 6.11. If $f(x)$ is integrable on $[a, N)$ for every $N > a$ and if $f(x)$ is
positive and monotonically decreasing to 0 as $x \to \infty$, then $\int_a^{\infty} f(x)\sin x\,dx$
exists.


Proof. Since $f(x) > 0$ and $\sin x$ alternates sign, the graph of $y = f(x)\sin x$
is as in Figure 6.6.


Figure 6.6. y = f(x) sin(x).

We must show, by definition, that

$$\lim_{N\to\infty}\int_a^N f(x)\sin x\,dx$$

exists. For that, divide $[a, \infty)$ as pictured in Figure 6.7.


Figure 6.7. A partition of $[a, \infty)$.


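That is, $m_0\pi$ is the first multiple of $\pi$ after $a$, and $N$ is chosen such that
$n\pi \le N < (n+1)\pi$. Then

$$
\begin{aligned}
\int_a^N f(x)\sin x\,dx = {} & \int_a^{m_0\pi} f(x)\sin x\,dx + \int_{m_0\pi}^{(m_0+1)\pi} f(x)\sin x\,dx \\
& + \cdots + \int_{(n-1)\pi}^{n\pi} f(x)\sin x\,dx + \int_{n\pi}^{N} f(x)\sin x\,dx.
\end{aligned}
$$

The Archimedean property of $\mathbb{R}$ raises its head again.

Convince yourself that since $f(x)$ is positive and monotonically decreasing,
we have

$$\left|\int_{n\pi}^{N} f(x)\sin x\,dx\right| \le f(n\pi)\int_{n\pi}^{(n+1)\pi}|\sin x|\,dx = 2f(n\pi). \tag{6.16}$$

Hence (using equation (6.16)), we have

$$\left|\int_{k\pi}^{(k+1)\pi} f(x)\sin x\,dx\right| = \int_{k\pi}^{(k+1)\pi} f(x)|\sin x|\,dx \le 2f(k\pi) \le \left|\int_{(k-1)\pi}^{k\pi} f(x)\sin x\,dx\right|.$$

So by Leibniz’s result on alternating series whose $n$th term goes to zero, we
see that

$$\int_{m_0\pi}^{(m_0+1)\pi} f(x)\sin x\,dx + \cdots + \int_{(n-1)\pi}^{n\pi} f(x)\sin x\,dx$$

converges as $n \to \infty$, while the leftover piece $\int_{n\pi}^{N} f(x)\sin x\,dx$ tends to 0 by (6.16).

Leibniz’s result on alternating series is also known as the alternating
series test.

Hence

$$\lim_{N\to\infty}\int_a^N f(x)\sin x\,dx$$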


exists. That is the same as saying that $\int_a^{\infty} f(x)\sin x\,dx$ exists, as promised.

Let $u = x^2$, so that $dx = \frac{du}{2\sqrt u}$.

Moving up to $\mathbb{C}$ will make the calculations simpler and highlight the
essential ideas in the proof.

Corollary 6.12. It follows that the integrals

$$\int_0^{\infty}\sin(x^2)\,dx = \frac12\int_0^{\infty}\frac{\sin x}{\sqrt x}\,dx$$

and

$$\int_0^{\infty}\cos(x^2)\,dx = \frac12\int_0^{\infty}\frac{\cos x}{\sqrt x}\,dx$$

exist.
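The mechanism in the proof of Lemma 6.11 is easy to watch numerically for $f(x) = 1/\sqrt x$, the case needed for the corollary. The sketch below (our own illustration, using Simpson's rule and hypothetical helper names) computes the integrals of $\frac{\sin x}{\sqrt x}$ over successive intervals $[k\pi, (k+1)\pi]$; they should alternate in sign and shrink in absolute value, which is exactly what the alternating series test needs.

```python
import math

def simpson(g, a, b, m=200):
    """Composite Simpson rule on [a, b] with an even number m of subintervals."""
    h = (b - a) / m
    total = g(a) + g(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3.0

def block(k):
    """Integral of sin(x)/sqrt(x) over [k*pi, (k+1)*pi], for k >= 1."""
    return simpson(lambda x: math.sin(x) / math.sqrt(x), k * math.pi, (k + 1) * math.pi)

blocks = [block(k) for k in range(1, 12)]

if __name__ == "__main__":
    for k, value in enumerate(blocks, start=1):
        # each block is bounded in absolute value by 2 f(k*pi) = 2/sqrt(k*pi)
        print(k, value, 2.0 / math.sqrt(k * math.pi))
```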


6.4.2 Using Complex Numbers 


Next, we up the ante a little and look at functions defined on $\mathbb{R}$ and taking
values in $\mathbb{C}$. If $f(x)$ is a complex-valued function of a real variable $x$, then one
may define $\int_a^b f(x)\,dx = \int_a^b f_1(x)\,dx + i\int_a^b f_2(x)\,dx$, where $f_1(x) + if_2(x) = f(x)$.
We say that $f$ is differentiable if $f_1$ and $f_2$ are. In this way, all of our
results on Fourier series carry over formally, and we can state the following
theorem.


Theorem 6.13. If $f(x)$ is complex-valued on $[0, 1]$ and differentiable at
$x_0 \in (0,1)$, then

$$f(x_0) = \frac{\alpha_0}{2} + \sum_{n=1}^{\infty}\big(\alpha_n\cos 2\pi nx_0 + \beta_n\sin 2\pi nx_0\big),$$

where

$$\alpha_n = 2\int_0^1 f(t)\cos 2\pi nt\,dt, \quad n = 0, 1, \ldots,$$

$$\beta_n = 2\int_0^1 f(t)\sin 2\pi nt\,dt, \quad n = 1, 2, \ldots.$$

Furthermore, if $f(x)$ has a right-hand derivative at 0 and a left-hand derivative at 1,
then

$$\frac{\alpha_0}{2} + \sum_{n=1}^{\infty}\alpha_n = \frac{f(0) + f(1)}{2}.$$

If we write $\exp t = e^t$ and define $f(x)$ by the formula

$$f(x) := \exp\Big(\frac{2\pi ix^2}{n}\Big) = \cos\frac{2\pi x^2}{n} + i\sin\frac{2\pi x^2}{n},$$

then $f(1) = \zeta$, and $f(j) = \zeta^{j^2}$ when $j$ is an integer. So the Gauss sum $G_n$ is
simply $f(0) + f(1) + \cdots + f(n-1)$.


6.4.3 The Value of the Gauss Sum 


We are now ready to compute $G_n$. Recall that if $f(x)$ is differentiable and
complex-valued on $[0, 1]$, then we have

$$\frac{\alpha_0}{2} + \sum_{n=1}^{\infty}\alpha_n = \frac{f(0) + f(1)}{2}, \tag{6.17}$$

where $\alpha_0, \alpha_1, \ldots$ are the even “normalized” Fourier coefficients for the
interval $[0, 1]$:

$$\alpha_n = 2\int_0^1 f(t)\cos 2\pi nt\,dt.$$

If we extend the definitions of $\alpha_n$ and $\beta_n$ to negative $n$, we can symmetrize
the above identity. For if $n > 0$, then

$$\alpha_{-n} = 2\int_0^1 f(t)\cos 2\pi(-n)t\,dt = \alpha_n,$$

$$\beta_{-n} = 2\int_0^1 f(t)\sin 2\pi(-n)t\,dt = -\beta_n.$$

Hence

$$\sum_{n=-N}^{N}(\alpha_n + i\beta_n) = \alpha_0 + 2\sum_{n=1}^{N}\alpha_n.$$

But

$$\alpha_n + i\beta_n = 2\int_0^1 f(t)\big(\cos 2\pi nt + i\sin 2\pi nt\big)\,dt = 2\int_0^1 f(t)\exp(2\pi int)\,dt.$$

Hence on summing from $-N$ to $N$, we have

$$\sum_{n=-N}^{N}(\alpha_n + i\beta_n) = 2\sum_{r=-N}^{N}\int_0^1 f(t)\exp(2\pi irt)\,dt.$$

It follows that

$$\frac{\alpha_0}{2} + \sum_{n=1}^{N}\alpha_n = \sum_{r=-N}^{N}\int_0^1 f(t)\exp(2\pi irt)\,dt.$$

But the limit as $N \to \infty$ of the left-hand side is $\frac{f(0) + f(1)}{2}$ by equation (6.17),
and we end up with the beautiful formula

$$\frac{f(0) + f(1)}{2} = \lim_{N\to\infty}\sum_{r=-N}^{N}\int_0^1 f(t)\exp(2\pi irt)\,dt.$$
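To see the symmetrized formula in action on the simplest possible case, take $f(t) = t$ on $[0, 1]$ (our choice of example, not the text's). Integration by parts gives each integral in closed form, and the terms for $r$ and $-r$ cancel in pairs:

```latex
\[
\int_0^1 t\,e^{2\pi i r t}\,dt =
\begin{cases}
\dfrac12, & r = 0,\\[2ex]
\dfrac{1}{2\pi i r}, & r \neq 0,
\end{cases}
\qquad\text{so}\qquad
\sum_{r=-N}^{N}\int_0^1 t\,e^{2\pi i r t}\,dt
= \frac12 + \sum_{r=1}^{N}\Big(\frac{1}{2\pi i r} - \frac{1}{2\pi i r}\Big)
= \frac12 = \frac{f(0)+f(1)}{2}.
\]
```

Here every symmetric partial sum already equals $\frac12$, the average of the endpoint values $f(0) = 0$ and $f(1) = 1$.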


Next, consider our function on the interval $[j, j+1]$. Then just as for the
$[0, 1]$ Fourier theorem, we have

$$\frac{f(j) + f(j+1)}{2} = \lim_{N\to\infty}\sum_{r=-N}^{N}\int_j^{j+1} f(t)\exp(2\pi irt)\,dt. \tag{6.18}$$

So we consider $f(x) = \exp\big(\frac{2\pi ix^2}{n}\big)$ on $[0,1], [1,2], \ldots, [n-1, n]$ and
sum (6.18) on $j$ from 0 to $n-1$. The left-hand side is

$$
\begin{aligned}
\tfrac12\big(f(0) + f(1) + f(1) + f(2) + f(2) + f(3) + \cdots + f(n-1) + f(n)\big)
&= f(0) + f(1) + \cdots + f(n-1) \quad (\text{since } f(0) = f(n)) \\
&= G_n,
\end{aligned}
$$

while the right-hand side is the unsightly mess

$$\lim_{N\to\infty}\sum_{r=-N}^{N}\int_0^n\exp\Big(\frac{2\pi it^2}{n} + 2\pi irt\Big)\,dt.$$

The change of variable $t = nt'$ changes the limit of integration, and the
expression becomes

$$
\begin{aligned}
\lim_{N\to\infty}\sum_{r=-N}^{N} n\int_0^1\exp\Big(\frac{2\pi in^2t'^2}{n} + 2\pi irnt'\Big)\,dt'
= \lim_{N\to\infty}\sum_{r=-N}^{N} n\int_0^1\exp\big(2\pi in(t'^2 + rt')\big)\,dt'.
\end{aligned}
$$

$t = nt'$, $dt = n\,dt'$, $\frac{t^2}{n} = nt'^2$.

If you’ve hung on so far, keep going. The rest will be worth the effort.

Dropping the $'$ decoration, this becomes

$$G_n = n\lim_{N\to\infty}\sum_{r=-N}^{N}\int_0^1\exp\big(2\pi in(t^2 + rt)\big)\,dt. \tag{6.19}$$


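So far, so good. Now we operate on $\int_{-\infty}^{\infty}\exp(2\pi ix^2)\,dx$. Call its value $T$.
This integral is simply $2\int_0^{\infty}\exp(2\pi ix^2)\,dx$. However, we symmetrize around
zero to fit the earlier discussion. Then

$$\int_{-\infty}^{\infty}\exp(2\pi ix^2)\,dx = \lim_{N\to\infty}\int_{-N}^{N}\exp(2\pi ix^2)\,dx.$$

Change the variable by setting $x = \sqrt n\,y$. Then

$$
\begin{aligned}
T = \int_{-\infty}^{\infty}\exp(2\pi ix^2)\,dx &= \lim_{N\to\infty}\sqrt n\int_{-N/\sqrt n}^{N/\sqrt n}\exp(2\pi iny^2)\,dy \\
&= \sqrt n\lim_{N\to\infty}\int_{-N}^{N}\exp(2\pi iny^2)\,dy,
\end{aligned}
\tag{6.20}
$$

since replacing $N/\sqrt n$ by $N$ doesn’t change the limit. (We already know that it
exists!)

To keep notation down, let $h(y) = \exp(2\pi iny^2)$. Now the idea (due to
Dirichlet) is to compute the limit in two ways, breaking up the interval of
integration at integers:

$$\int_{-N}^{N} h(y)\,dy = \int_{-N}^{-N+1} h(y)\,dy + \int_{-N+1}^{-N+2} h(y)\,dy + \cdots + \int_{N-1}^{N} h(y)\,dy, \tag{6.21}$$

and then by half-integers, considering the partial sum:

$$\int_{-N-\frac12}^{N+\frac12} h(y)\,dy = \int_{-N-\frac12}^{-N+\frac12} h(y)\,dy + \int_{-N+\frac12}^{-N+\frac32} h(y)\,dy + \cdots + \int_{N-\frac12}^{N+\frac12} h(y)\,dy. \tag{6.22}$$

Changing the integration interval to $[0, 1]$ in each term of (6.21) and
substituting into (6.20) gives

$$T = \sqrt n\lim_{N\to\infty}\sum_{k=-N}^{N}\int_0^1\exp\big(2\pi in(x^2 + 2kx)\big)\,dx. \tag{6.23}$$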


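On the sum by half-integers, change the variable again, referring
everything to $[0,1]$. A typical term in the sum is

$$\int_{k-\frac12}^{k+\frac12}\exp(2\pi iny^2)\,dy.$$

The substitution $y = k - \frac12 + x$ transforms the integral to

$$
\begin{aligned}
\int_0^1\exp\Big(2\pi in\big(k - \tfrac12 + x\big)^2\Big)\,dx
&= \int_0^1\exp\Big(2\pi in\big(k^2 - k + \tfrac14 + (2k-1)x + x^2\big)\Big)\,dx \\
&= \exp\Big(\frac{2\pi in}{4}\Big)\int_0^1\exp\big(2\pi in(x^2 + (2k-1)x)\big)\,dx,
\end{aligned}
$$

where the factor $\exp\big(2\pi in(k^2 - k)\big)$ has dropped out because $n(k^2 - k)$ is an integer.

However, $\exp\big(\frac{2\pi in}{4}\big) = i^n$. Hence equation (6.22) becomes

$$T = \sqrt n\,i^n\lim_{N\to\infty}\sum_{k=-N}^{N}\int_0^1\exp\big(2\pi in(x^2 + (2k-1)x)\big)\,dx. \tag{6.24}$$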
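The phase bookkeeping in the half-integer substitution is a natural place for a slip, so here is a small numerical check that $\exp\big(2\pi in(k - \frac12 + x)^2\big) = i^n\exp\big(2\pi in(x^2 + (2k-1)x)\big)$ for integers $n \ge 1$ and $k$. It is a sketch with names of our own; the sample values of $n$, $k$, and $x$ are arbitrary.

```python
import cmath
import math

I_POWERS = (1, 1j, -1, -1j)  # i**n depends only on n mod 4

def shifted_integrand(n, k, x):
    """exp(2 pi i n y^2) evaluated at y = k - 1/2 + x."""
    return cmath.exp(2j * math.pi * n * (k - 0.5 + x) ** 2)

def factored_integrand(n, k, x):
    """i^n * exp(2 pi i n (x^2 + (2k - 1) x))."""
    return I_POWERS[n % 4] * cmath.exp(2j * math.pi * n * (x * x + (2 * k - 1) * x))

if __name__ == "__main__":
    for n, k, x in ((1, 0, 0.25), (3, 2, 0.7), (6, -3, 0.1)):
        print(n, k, x, shifted_integrand(n, k, x), factored_integrand(n, k, x))
```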


Let us now examine (6.23) and (6.24) (repeated here for reference):

$$T = \sqrt n\lim_{N\to\infty}\sum_{k=-N}^{N}\int_0^1\exp\big(2\pi in(x^2 + 2kx)\big)\,dx, \tag{6.23}$$

$$T = \sqrt n\,i^n\lim_{N\to\infty}\sum_{k=-N}^{N}\int_0^1\exp\big(2\pi in(x^2 + (2k-1)x)\big)\,dx. \tag{6.24}$$

In (6.23), we have a sum on $k$ from $-N$ to $N$ with $2k$ appearing under the
integral, and in (6.24) we have the sum over the odd integers $2k - 1$. In other
words, we have summed $\int_0^1\exp\big(2\pi in(x^2 + sx)\big)\,dx$ for all $s$ from $-2N$ to $2N$.
That is to say, dividing (6.23) by $\sqrt n$ and (6.24) by $\sqrt n\,i^n$ and adding gives
(replacing $2N$ by $N$, of course)

$$\frac{T}{\sqrt n}\big(1 + i^{-n}\big) = \lim_{N\to\infty}\sum_{s=-N}^{N}\int_0^1\exp\big(2\pi in(x^2 + sx)\big)\,dx.$$

Comparing this with equation (6.19), we have the remarkable result, and
the aim of the entire investigation,

$$\frac{T}{\sqrt n}\big(1 + i^{-n}\big) = \frac{G_n}{n}.$$

This gives

$$G_n = \sqrt n\,T\big(1 + i^{-n}\big). \tag{6.25}$$

The magic isn’t over yet. In this relation, let $n = 1$. Since $G_1 = 1$ (and $\sqrt 1 = 1$),
we have

$$T = \frac{1}{1 - i} = \frac{1 + i}{2}.$$

Let $u = x^2$, so that $dx = \frac{du}{2\sqrt u}$.

That is,

$$\int_{-\infty}^{\infty}\exp(2\pi ix^2)\,dx = \frac12 + \frac12 i.$$

Equating the real and imaginary parts gives

$$\int_{-\infty}^{\infty}\cos(2\pi x^2)\,dx = \frac12$$

and

$$\int_{-\infty}^{\infty}\sin(2\pi x^2)\,dx = \frac12.$$

Changing variables gives

$$\int_0^{\infty}\frac{\cos x}{\sqrt x}\,dx = \sqrt{\frac\pi2} \tag{6.26}$$

and

$$\int_0^{\infty}\frac{\sin x}{\sqrt x}\,dx = \sqrt{\frac\pi2}. \tag{6.27}$$
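The value $T = \frac{1+i}{2}$ can be checked, at least roughly, by integrating $\exp(2\pi ix^2)$ over a large symmetric interval. The sketch below uses Simpson's rule with a step fine enough to follow the oscillation; the name `truncated_T` and the step size are our own assumptions. Convergence is slow: integrating the tail by parts shows that the truncation error decays only like $1/N$, so agreement to two decimal places is all one should expect here.

```python
import cmath
import math

def truncated_T(N, steps_per_unit=2000):
    """Simpson approximation to the integral of exp(2 pi i x^2) over [-N, N].

    The integrand is even, so we integrate over [0, N] and double.
    """
    m = 2 * int(N * steps_per_unit)  # even number of subintervals on [0, N]
    h = N / m
    g = lambda x: cmath.exp(2j * math.pi * x * x)
    total = g(0.0) + g(N)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(i * h)
    return 2.0 * total * h / 3.0

if __name__ == "__main__":
    for N in (5, 10, 20):
        print(N, truncated_T(N))
```

The printed values should hover near $0.5 + 0.5i$, approaching it as $N$ grows.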


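However, knowing

$$T = \frac{1 + i}{2}$$

gives (at last) the value of the Gauss sum $G_n$:


Theorem 6.14 (The value of the Gauss sum). The Gauss sum

$$G_n = \sqrt n\,\frac{(1 + i)(1 + i^{-n})}{2} \tag{6.28}$$

can be simplified depending on the value of $n$ modulo 4:


(i) If $n \equiv 0 \bmod 4$, then
$G_n = \sqrt n\,\frac{(1+i)(1+1)}{2} = \sqrt n\,(1 + i)$.


(ii) If $n \equiv 1 \bmod 4$, then
$G_n = \sqrt n\,\frac{(1+i)(1-i)}{2} = \sqrt n$.


(iii) If $n \equiv 3 \bmod 4$, then
$G_n = \sqrt n\,\frac{(1+i)(1+i)}{2} = i\sqrt n$.


(iv) If $n \equiv 2 \bmod 4$, then
$G_n = \sqrt n\,\frac{(1+i)(1-1)}{2} = 0$.


In particular, if $n$ is a prime $p$, we have the famous result of Gauss that we
cited earlier in Section 2.2:


Corollary 6.15. If $\zeta = \cos\big(\frac{2\pi}{p}\big) + i\sin\big(\frac{2\pi}{p}\big)$, we have

$$1 + \zeta + \zeta^4 + \cdots + \zeta^{(p-1)^2} = \begin{cases}\sqrt p & \text{if } p \equiv 1\ (4),\\ i\sqrt p & \text{if } p \equiv 3\ (4).\end{cases}$$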
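Since the Gauss sum is a finite sum, the closed form $G_n = \sqrt n\,\frac{(1+i)(1+i^{-n})}{2}$ can be compared directly with a brute-force computation. The following is a minimal sketch; the function names are ours, and the exponent $j^2$ is reduced modulo $n$ only to keep the floating-point angles small.

```python
import cmath
import math

def gauss_sum(n):
    """G_n = sum over j = 0..n-1 of exp(2 pi i j^2 / n), computed term by term."""
    return sum(cmath.exp(2j * math.pi * ((j * j) % n) / n) for j in range(n))

def gauss_formula(n):
    """sqrt(n) * (1 + i) * (1 + i**(-n)) / 2; i**(-n) depends only on n mod 4."""
    i_to_minus_n = (1, -1j, -1, 1j)[n % 4]
    return math.sqrt(n) * (1 + 1j) * (1 + i_to_minus_n) / 2

if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5, 7, 12, 13):
        print(n, gauss_sum(n), gauss_formula(n))
```

For a prime such as $p = 13 \equiv 1 \pmod 4$ both columns should show $\sqrt{13}$, and for $p = 7 \equiv 3 \pmod 4$ both should show $i\sqrt 7$, up to rounding.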


Exercises 


6.15 Show that if $f(x)$ is integrable on $[a, N)$ for every $N > a$ and if $f(x)$
is positive and monotonically decreasing to 0 as $x \to \infty$, then
$\int_a^{\infty} f(x)\cos x\,dx$ exists.


6.5 On $\int_0^{\infty}\frac{\sin t}{t}\,dt$ and $\sum_{k=1}^{\infty}\frac{\sin kx}{k}$

We have seen that a differentiable function defined on $[-\pi, \pi]$ may be
expressed as a trigonometric series (its Fourier series). The proof depended on
our favorite trigonometric identity ((6.30) below) and a lemma of Riemann’s
(Lemma 6.2). This technique was already used in the proof of the identity


$\frac{\pi^2}{8} = 1 + \frac{1}{3^2} + \frac{1}{5^2} + \cdots$ that we gave in Section 6.1. And in the preceding section,
using a somewhat more involved argument, we saw that Fourier series gave 


the value of the Gauss sum and the value of the integrals

$$\int_0^{\infty}\frac{\cos x}{\sqrt x}\,dx \quad\text{and}\quad \int_0^{\infty}\frac{\sin x}{\sqrt x}\,dx.$$


Trigonometric series can also be used to evaluate $\int_0^{\infty}\frac{\sin x}{x}\,dx$,
and that is what we take up here to close out the chapter. The proof is an
elegant application of all the methods developed in this chapter, and it shows
simultaneously that for $x \in (0, \pi]$,

$$\frac{\pi - x}{2} = \sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots \tag{6.29}$$

and

$$\int_0^{\infty}\frac{\sin t}{t}\,dt = \frac\pi2.$$


Of course, the infinite series (6.29) is, by Theorem 6.10, the Fourier series
of $\frac{\pi - x}{2}$ made odd on $[-\pi, \pi]$. However, we shall not assume that fact. We will
use a simple case of Riemann’s lemma and the fact that $\int_0^{\infty}\frac{\sin x}{x}\,dx$ exists.
The existence of $\int_0^{\infty}\frac{\sin x}{x}\,dx$ is a special case of Lemma 6.11, proved in the
previous section. The proof then goes as follows ...

Lemma 6.11 says that $\int_a^{\infty} f(x)\sin x\,dx$ exists when $f(x)$ is positive and
monotonically decreasing to zero as $x \to \infty$.

Beginning (of course) with the identity

$$\frac12 + \cos t + \cos 2t + \cdots + \cos nt = \frac{\sin\big(n + \frac12\big)t}{2\sin\frac t2}, \tag{6.30}$$

we integrate both sides from 0 to $x$ for $0 < x \le \pi$. That gives

$$\frac x2 + \sin x + \frac{\sin 2x}{2} + \frac{\sin 3x}{3} + \cdots + \frac{\sin nx}{n} = \int_0^x\frac{\sin\big(n + \frac12\big)t}{2\sin\frac t2}\,dt. \tag{6.31}$$

Think of $x$ as fixed in $(0, \pi]$. Write

$$\int_0^x\frac{\sin\big(n + \frac12\big)t}{2\sin\frac t2}\,dt
= \int_0^x\Big(\frac{1}{2\sin\frac t2} - \frac1t\Big)\sin\big(n + \tfrac12\big)t\,dt + \int_0^x\frac{\sin\big(n + \frac12\big)t}{t}\,dt.$$

For fixed $x$, the first integral goes to zero as $n \to \infty$ by Riemann’s lemma.
We need to know, of course, that

$$\frac1t - \frac{1}{2\sin\frac t2}$$

is well behaved at the origin, a point we established in Section 6.2 with a
double application of l’Hospital. As for the second integral, we change the
variable by writing

$$\Big(n + \frac12\Big)t = \xi.$$

Then

$$\int_0^x\frac{\sin\big(n + \frac12\big)t}{t}\,dt = \int_0^{(n + \frac12)x}\frac{\sin\xi}{\xi}\,d\xi.$$

But this integral approaches $\int_0^{\infty}\frac{\sin\xi}{\xi}\,d\xi$ as $n \to \infty$, since $x \ne 0$. It follows
that (6.31) has a limit as $n \to \infty$, and that limit is $\int_0^{\infty}\frac{\sin t}{t}\,dt$. In other words,

$$\frac x2 + \sum_{k=1}^{\infty}\frac{\sin kx}{k} = \int_0^{\infty}\frac{\sin t}{t}\,dt \quad\text{for } 0 < x \le \pi.$$

And now (more magic) ... let $x = \pi$ (note that $x = 0$ is forbidden). We obtain
the following lovely result.


Theorem 6.16.

$$\frac\pi2 = \int_0^{\infty}\frac{\sin t}{t}\,dt.$$

And there’s even more: On substituting back, we have

$$\frac{\pi - x}{2} = \sum_{k=1}^{\infty}\frac{\sin kx}{k} \quad\text{for } 0 < x \le \pi.$$

We can’t resist then putting $x = \pi/2$ to obtain (again)

$$\frac\pi4 = 1 - \frac13 + \frac15 - \frac17 + \frac19 - \cdots.$$

We met this identity before: equation (6.15).

Thus the formal analogy between the “continuous” sum $\int_0^{\infty}\frac{\sin t}{t}\,dt$ and the
“discrete” sum $\sum_{k=1}^{\infty}\frac{\sin kx}{k}$ seems more than merely formal.
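Both halves of this closing argument can be probed numerically: the truncated integral $\int_0^X\frac{\sin t}{t}\,dt$ for large $X$, and the quantity $\frac x2 + \sum_{k \le n}\frac{\sin kx}{k}$ for large $n$, should each be close to $\frac\pi2$. The sketch below uses Simpson's rule and plain summation; the names and the cutoffs $X = 200$, $n = 5000$ are our own choices, and both approximations converge slowly.

```python
import math

def sinc(t):
    """sin(t)/t, extended by its limiting value 1 at t = 0."""
    return 1.0 if t == 0.0 else math.sin(t) / t

def sine_integral(X, m=20000):
    """Composite Simpson approximation to the integral of sin(t)/t over [0, X] (m even)."""
    h = X / m
    total = sinc(0.0) + sinc(X)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * sinc(i * h)
    return total * h / 3.0

def left_side(x, n):
    """x/2 + sin x + sin 2x / 2 + ... + sin nx / n."""
    return x / 2.0 + sum(math.sin(k * x) / k for k in range(1, n + 1))

if __name__ == "__main__":
    print(sine_integral(200.0), math.pi / 2.0)
    for x in (1.0, 2.0, 3.0):
        print(x, left_side(x, 5000), math.pi / 2.0)
```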


Epilogue 


Looking Back 


This text involves immersing oneself in two faces of the discipline: 


(i) working on hard problems, accompanied by reflection on the habits of 
mind you use or develop during this process; 


(ii) studying the work of others, trying to figure out how they might have 
been thinking when developing their methods and results. 


There are many wonderful texts that address both faces of this Janus head, 
but the 1972 course by Ken Ireland was my first encounter with a design 
that integrates these two ways of doing mathematics in what might seem like 
a reversal of the customary order—what I called in the preface “experience 
before formality.” 

Experience before formality became a foundation not only for my own 
mathematical work (where it is the typical way new results emerge), but also 
for my approach to teaching mathematics. It is quite difficult to explain how 
effective this teaching practice is to someone who hasn’t experienced it as a 
student, but you have just been through a text that is based on this principle. 
Take some time to reflect on what you have done. Think about how it would 
play out in your future work in mathematics. Think about how it would play 
out in your teaching. This is beginning to sound preachy, so I’ll make only
one suggestion: Just try it. 

One way to think about the results developed in these chapters is a theme 
I mentioned in the preface: in high school, and, to some extent, in undergrad- 
uate courses, many results of fundamental importance in the history of math- 
ematics are stated without proof. There are good reasons for this: Many of 
the proofs are quite technical (as you have just seen) and involve background 
that may not be current (or may not exist) for some students. Developing the 
prerequisite knowledge would take more time than a syllabus allows. But the 
proof of a result does more than establish a fact; it gives you a sense of why 
the fact was of interest to mathematicians in the first place, and it helps you 
to make teaching decisions about what to emphasize, what examples to use, 
and where the result fits into the overall mathematical landscape.

Another instance of why experience precedes formality ....

You can’t construct a regular heptagon with straightedge and compass; the
fundamental theorem of algebra requires the odd-degree root theorem; e and
π are transcendental; a = vei a every prime congruent to 1 mod 4 can
be written as the sum of two squares; there is a formula for the number of
representations of an integer as a sum of two squares .... All these and more
appear somewhere in high-school and college programs, and one of the major
goals of this book is to “mop up”: to fill in the details of why these things
are true.

Many of the proofs of these central results that we present here have been
revisited and polished for generations, so you are seeing a kind of finished
product. But the elegant proofs are the capstones of long periods
(sometimes years or decades) of intense work. Following a long proof often
gives you the feeling of “I get it, but how would someone come up with that?”

A good example appears in Chapter 5, where the proofs of the main results
follow a recognizable rhythm, each starting with a function we call the
“Hermite beast.” How in the world did someone come up with these? That’s where
some of the Dialing In problems come in. They provide ideas about, for
example, how one might conceive of an expression that is a positive integer less
than 1 and hence can’t exist, negating the assumption that the number in
question is rational or algebraic.

See, for example, Dialing In problems 114, 115, and ...


Looking Forward 


What next? I have some suggestions. 


(i) Join a professional organization. These make available many useful
resources for mathematical work:


(a) The American Mathematical Society (AMS.org) is aimed at the
mathematics research community.


(b) The Mathematical Association of America (MAA.org) serves the
mathematics teaching community, mainly undergraduate and high
school.


(c) The National Council of Teachers of Mathematics (NCTM.org) is
primarily for PK–12 mathematics teachers.

NCTM features a mix of pedagogy and mathematical activities.

Each of these organizations supports local, regional, and national
conferences. Membership will keep you informed of when and where these
take place, and, with a little careful curating, you will find in them some
wonderful mathematics and very useful ideas about teaching.


(ii) Present your work to the field. Getting ideas out there is becoming
easier all the time. Outlets include widely read blogs (each of the
organizations above has at least one), journals (electronic and paper), presenting
at conferences (like the ones mentioned above), and good old paper texts
(like this one). Publishing or presenting serves many purposes, but one
that is often overlooked is that it helps you clarify ideas and make new
connections for yourself. Many of the problems and exercises in this
book make ideal loci for a journal article or a blog post.

Just suggestions: Exercise 2.40, and Dialing In problems 22, 43, 95, 126,
all make launchpads for interesting MAA, AMS, or NCTM papers.




(iii) Join a community of mathematical practice. A mathematical inves- 
tigation usually begins in solitude—you read or hear about something 
intriguing and you dive in, all by yourself, doodling, thinking, experi- 
menting, and trying things. But when the insight comes, it really helps 
to brainstorm with friends and colleagues. Many teachers have formed 
informal study groups that meet regularly to work on specific topics. 
Many departments sponsor regular seminars. And there are national sites 
of mathematical practice. Two in particular use the “experience first” 
design: 


(a) The Park City Mathematics Institute [82] offers summer programs 
for all parts of our community (mathematics research, undergraduate 
teaching, precollege teaching ...). 


(b) PROMYS at Boston University [81] is a longstanding program for 
advanced secondary-school students and practicing secondary-school 
teachers. The program is explicitly designed to give participants the 
experience of working as a mathematician. 


These are just examples. The point is that staying connected with oth- 
ers who have similar mathematical dispositions greatly enhances your 
mathematical experience. 


The 1972 version of the last Dialing In problem ended with a quotation 
from the Looney Tunes cartoons:

[Handwritten figure; the legible closing line reads “That’s all, Folks.”]


True to form, that was a joke; I sincerely hope that you keep doing and 
teaching mathematics in the style and spirit of this book. 

—Al Cuoco 

May 29, 2022